Stochastic 
Models 


for Learning 


WILEY PUBLICATIONS
IN STATISTICS


Walter A. Shewhart, Editor 


Mathematical Statistics 


CRAMER - The Elements of Probability Theory and Some of Its Applications 
SAVAGE - Foundations of Statistics 
BLACKWELL and GIRSHICK - Theory of Games and Statistical Decisions 
HANSEN, HURWITZ, and MADOW - Sample Survey Methods 
and Theory, Volume II 
DOOB - Stochastic Processes 
RAO - Advanced Statistical Methods in Biometric Research 
KEMPTHORNE - The Design and Analysis of Experiments 
DWYER - Linear Computations 
FISHER - Contributions to Mathematical Statistics 
WALD - Statistical Decision Functions 
FELLER - An Introduction to Probability Theory and Its Applications, Volume 1 
WALD - Sequential Analysis 
HOEL - Introduction to Mathematical Statistics, Second Edition 


Applied Statistics 


BUSH and MOSTELLER - Stochastic Models for Learning 

FRYER - Elements of Statistics 

BENNETT and FRANKLIN - Statistical Analysis in Chemistry and the 
Chemical Industry 

COCHRAN - Sampling Techniques 

WOLD and JUREEN - Demand Analysis 

HANSEN, HURWITZ, and MADOW - Sample Survey Methods and Theory, Volume I 

CLARK - An Introduction to Statistics 

TIPPETT - The Methods of Statistics, Fourth Edition 

ROMIG - 50-100 Binomial Tables 

GOULDEN - Methods of Statistical Analysis, Second Edition 

HALD - Statistical Theory with Engineering Applications 

HALD - Statistical Tables and Formulas 


YOUDEN - Statistical Methods for Chemists 
MUDGETT - Index Numbers 


TIPPETT - Technological Applications of Statistics 
DEMING - Some Theory of Sampling 

COCHRAN and COX - Experimental Designs 
RICE - Control Charts 

DODGE and ROMIG - Sampling Inspection Tables 


Related Books of Interest to Statisticians 


ALLEN and ELY - International Trade Statistics 
HAUSER and LEONARD - Government Statistics for Business Use 




Stochastic 
Models 


for Learning 


ROBERT R. BUSH 


Assistant Professor of Social Relations 


FREDERICK MOSTELLER 


Professor of Mathematical Statistics 


DEPARTMENT OF SOCIAL RELATIONS 
Harvard University 


New York - John Wiley & Sons, Inc. 
London - Chapman & Hall, Limited 


Copyright, 1955 
BY 
John Wiley & Sons, Inc. 


All Rights Reserved 

This book or any part thereof must not 
be reproduced in any form without 
the written permission of the publisher. 


COPYRIGHT, CANADA, 1955, INTERNATIONAL COPYRIGHT, 1955 
JOHN WILEY & SONS, INC., PROPRIETOR 


All Foreign Rights Reserved 


Reproduction in whole or in part forbidden. 




PRINTED IN THE UNITED STATES OF AMERICA 




Preface 


MATHEMATICS IS OFTEN REGARDED AS A SCIENCE, BUT TO DO SO 
is an error. Science is concerned with the empirical world, whereas 
mathematics is the study of relationships among empirically undefined 
quantities. The bridge between pure mathematics and empirical science 
is the identification of mathematical constructs with observables; when 
such identifications are made we shall say that we have a mathematical 
model of a situation. This book is concerned with such a model. 
Nothing new is offered to the mathematician in his professional capac- 
ity, because no new mathematical methods have been introduced; the 
kinds of mathematics employed are well known and rather well de- 
veloped. The model raises some new problems that have interested 
mathematicians, and they have given us the benefit of their results. 
Usually their proofs require a degree of mathematical sophistication 
beyond the level we have set for this volume. In addition, our model 
brings new problems of statistical estimation, but for the most part we 
have used slight variations on well-known methods to solve these prob- 
lems. In the future, mathematical statisticians may discover new and 
better ways of solving some of the problems we describe. 

To many experimental psychologists a book such as this may appear 
unnecessary because they may feel that answers to the important ques- 
tions of psychology are to be found in the laboratory or in the field and 
in the collection of more and better data, rather than in mathematical 
formulas. We have no quarrel with such views; rather, this book is 
for readers who feel that mathematical analysis may contribute to the 
development of theory in one of the many fields of psychology—learn- 
ing. We have no intention of trying to convert psychologists who feel 
that human behavior cannot be mathematically described. A possible 
mathematical framework for analyzing data from a variety of experi- 
ments on animal and human learning is presented. We have tried to 
give more than a collection of “local” models, each designed to analyze 
one special type of data. The system we describe in rather general 
terms is applied to a number of particular experimental problems, but 
we make no pretense at completeness or finality and shall feel much 
rewarded if we provide a start on a good approach. 


We are aware of numerous earlier attempts to construct mathe- 
matical models for learning. The history of such attempts would re- 
quire an extensive monograph, and so we have not attempted to review 
such work in this volume, though some of it can be shown to be closely 
related to our model, notably the work of Hull, Gulliksen and Wolfle, 
Rashevsky, and Thurstone. 

We wish to say a few words about the level of mathematics used in 
this book. For the most part we have tried to write our book for the 
experimental psychologist with only a limited background in mathe- 
matics. We assume that our reader has taken, at one time or another, 
an elementary course in differential and integral calculus and a course 
in applied statistics. Actually, we make relatively little use of calculus, 
but we assume a degree of facility with mathematical manipulations 
characteristic of a person who has completed a course in the subject. 
We will make use of certain kinds of mathematics such as matrix 
algebra and set theory ordinarily studied in more advanced courses, 
but we will present as complete an exposition of these topics as we feel 
is necessary for the analysis given. These expositions occur in the 
text when the needs arise. Because we are writing mainly for readers 
with limited mathematical preparation, we include many more steps 
in the various mathematical developments than we would otherwise. 
Our attempt to include so many equations is to aid the reader rather 
than to hinder him. Frequently we repeat an equation rather than 
force the reader to look back fifty pages. 

Perhaps a word of advice about reading mathematical material will 
not be out of order for the psychologist whose mathematics is rusty but 
who decides he wants to follow certain derivations in detail. Start 
with a fast reading of the sections concerned to get the general orienta- 
tion. Then take paper and pencil and work through each derivation 
step by step. A good understanding of a mathematical development 
often goes with a feeling that one has invented it oneself, and that the 
original authors were somewhat opaque. 


This book is divided into two main parts. In Part I we present the 
general model, describe some of its mathematical properties, and con- 
sider a number of special cases; in Part II we apply the model to a 
number of specific experimental problems in learning and devote con- 
siderable attention to the statistical problems of estimating model 
parameters and measuring goodness of fit. Not all the mathematical 
machinery developed in Part I is used in the applications of Part II. 
Our goal in Part I extends beyond the derivation of formulas to be used 
in analyzing data; we have attempted to subject the general model to 
careful mathematical analysis in order to study its properties. We feel 
that this study is important for making decisions about where and how 
to apply the model. 

Various methods may be used in reading this book, depending on 
the goal of the reader and his particular background in mathematics. 
We have starred many sections which are mainly mathematical and 
unnecessary for the main development. An experimental psychologist 
who is reading this book only to make a general evaluation can profit- 
ably omit the starred sections or, in fact, may omit or merely scan 
most of Part I. We do consider Chapters 1 and 3 of Part I absolutely 
essential to an understanding of Part II, however. In Part II we present 
applications of the mathematical framework to certain experimental 
problems, and there the psychologist will probably find material of 
most interest to him. We recommend that he read Chapter 9 on 
statistical estimation before reading later chapters, but Chapters 10, 11, 
12, 13, and 14 are essentially independent of one another. 

The psychologist who plans to apply the mathematical analysis to 
data of his own may find it necessary to read much more of the book. 
We do not consider the starred sections necessary nor do we think that 
Chapters 2, 4, and 6 are absolutely essential for this purpose. But 
beyond this we have few omissions to recommend. 

Those mathematicians who wish to examine the contents of this 
book may be mainly interested in Part I. The major mathematical 
problems arise in Chapters 4 and 6. Statisticians, on the other hand, 
may find more interest in Part II, since we consider a number of new 
problems in estimation. Our handling of these problems follows rather 
conventional lines, but the statements of the problems and the diffi- 
culties encountered may attract the statistician. 

The work we describe in this book began in the summer of 1949 
with an attempt by one of us (F. M.) to analyze some data on the 
reports of hospital patients on the analgesic effects of certain drugs. 
It was postulated that a learning effect was superimposed upon the 
biological effects of the drugs. These data were ultimately analyzed 
with a quite different model, but the original model suggested to us 
the possibility of an “operator” model for handling data from learning 
experiments. As we became more and more aware of the large bulk 
of empirical information on learning which was available, the model 
was reformulated and modified many times. Independent of our initial 
attempts, William K. Estes at Indiana University developed a model 
of conditioning based upon the principles of association theory. We 
have been much influenced by Estes’ work and have adopted many of 
his ideas from time to time. Also independent of our early work, George 
A. Miller when at Harvard University and at the Institute for Advanced 
Study investigated some of the sequential properties of behavior. This 
work too has influenced ours. Perhaps the period of greatest progress 
in our research occurred during the summer of 1951, when we had 
the privilege of working for two months with Cletus J. Burke, William 
K. Estes, George A. Miller, David Zeaman, William J. McGill, Kather- 
ine S. Harris, and Jane E. Beggs. This two-month seminar, entitled 
Mathematical Models for Behavior Theory, was made possible by the 
Social Science Research Council's program of Inter-University Sum- 
mer Research Seminars supported by a grant from the John and Mary 
R. Markle Foundation. Tufts College, acting as host to the group, 
kindly made space and facilities available. Further progress was made, 
particularly on mathematical problems, during the summer of 1952, 
when we participated in a University of Michigan conference at Santa 
Monica, California. That conference was part of a program sponsored 
by the Ford Foundation, and it was conducted in cooperation with 
the RAND Corporation in Santa Monica. Persons especially helpful 
to us at that conference were Richard Bellman, Clyde H. Coombs, 
Robert L. Davis, William K. Estes, Merrill M. Flood, Theodore E. 
Harris, Samuel Karlin, Tjalling C. Koopmans, Roy Radner, Harold N. 
Shapiro, Gerald L. Thompson, and Robert M. Thrall. Dr. Flood has 
had continued interest in these problems and has developed several 
models related to ours. 

From the beginning, the work which has resulted in this book was 
facilitated by support from the Laboratory of Social Relations at Har- 
vard University. During the first two years, one of us (R. R. B.) was 
a recipient of a Fellowship in the Natural and Social Sciences awarded 
jointly by the National Research Council and the Social Science Re- 
search Council. During the several years of research we have had 
many useful suggestions and criticisms from our colleagues at Harvard 
and elsewhere. During the early stages of this work, William O. 
Jenkins, now at the University of Tennessee, spent many hours read- 
ing and criticizing our work, making us aware of numerous experi- 
mental problems and, more generally, giving us an orientation towards 
experimental problems of interest to psychologists. We express our 
sincerest thanks to Dr. Jenkins for this guidance. Suggestions on 
mathematical and statistical questions were made from time to time 
by John W. Tukey and Allan Birnbaum. Joseph Weizenbaum assisted 
us in the early stages of the work on Chapter 13. We are also grateful 
to Lotte Bailyn, David G. Hays, Solomon Weinstock, and Thurlow R. 
Wilson for suggestions about revising the manuscript to make it more 
readable. George A. Miller critically read large portions of the semi- 
final draft and made numerous suggestions for the improvement of 
content and exposition. Doris Entwisle and Cleo Youtz prepared early 
drafts of the manuscript, carried out numerous computations, and made 
frequent helpful criticisms. The semifinal and final copies of the manu- 
script were prepared by Vernon L. Schonert. 

In every sense, this book and the research which preceded it were 
joint efforts of the two authors. Neither of us alone would have had 
the patience to carry the work to completion. Furthermore, the fre- 
quent demolishment of each other's ideas has kept the book to a 
reasonable size. 

ROBERT R. BUSH 


FREDERICK MOSTELLER 


Harvard University 
February, 1955 


Contents 


INTRODUCTION

PART I  THE MATHEMATICAL SYSTEM AND THE GENERAL MODEL

CHAPTER 1  THE BASIC MODEL
  1.1  Introductory comments
  1.2  Response classes and probabilities
  1.3  Factors which change the probabilities
  1.4  The concept of an operator
  1.5  Matrix operators
  1.6  Two alternatives or response classes
  1.7  Restrictions on the parameters
 *1.8  Generalization to r alternatives; combining classes condition
  1.9  Summary
  References

CHAPTER 2  STIMULUS SAMPLING AND CONDITIONING
  2.1  A set-theoretic approach
  2.2  Subsets and their combinations
  2.3  Probability and stimulus sampling
  2.4  Deduction of the operators
 *2.5  Extension to r response classes; homogeneity and combining classes
  2.6  Summary
  References

CHAPTER 3  SEQUENCES OF EVENTS
  3.1  Introduction
  3.2  The structure of some elementary [illegible]
  3.3  Repetitive application of a single operator
 *3.4  Alternative approaches
 *3.5  Repetitive application of matrix operator [illegible]
  3.6  Commutativity of the operators [illegible]
 *3.7  Commutativity of the matrix operators [illegible]
  3.8  The systematic sequence [illegible]
  3.9  Experimenter-controlled events
 *3.10 Experimenter-controlled events with t operators and r alternatives
  3.11 Simple Markov chains
  3.12 Subject-controlled events
  3.13 Experimenter-subject-controlled events
  3.14 Summary
  References

CHAPTER 4  DISTRIBUTIONS OF RESPONSE PROBABILITIES
  4.1  Introduction
  4.2  Definition of moments
  4.3  Moments for two experimenter-controlled events
 *4.4  Moments for t experimenter-controlled events
  4.5  Moments for two subject-controlled events
  4.6  Moments for experimenter-subject-controlled events
  4.7  Theorems about the p-value distributions
  4.8  Length of runs
  4.9  Summary
  References

CHAPTER 5  THE EQUAL ALPHA CONDITION
  5.1  Introduction
  5.2  Implications in the set-theoretic model
  5.3  Experimenter-controlled events
  5.4  The distribution means for two subject-controlled events
  5.5  The distribution variances for two subject-controlled events
  5.6  Subject-controlled events with $\lambda_1 = 1$ and $\lambda_2 = 0$
  5.7  Subject-controlled events with $\lambda_1 = 0$ and $\lambda_2 = 1$
  5.8  Experimenter-subject-controlled events
  5.9  Experimenter-subject-controlled events with limits zero and unity
 *5.10 Extension to r responses and s outcomes
 *5.11 Markov sequences of experimenter-controlled events
  5.12 Summary
  Reference

CHAPTER 6  APPROXIMATE METHODS
  6.1  Introduction
  6.2  Stat-rats
  6.3  The asymptotic distributions
  6.4  The expected operator
 *6.5  Bounds on the asymptotic mean
 *6.6  Improved bounds on the asymptotic mean
 *6.7  Bounds on the pre-asymptotic means
  6.8  Summary
  References

CHAPTER 7  OPERATORS WITH LIMITS ZERO AND UNITY
  7.1  Introduction
  7.2  The asymptotic distribution for Case I
  7.3  Bounds on the asymptotic mean for Case I
  7.4  Further restrictions on Case I
 *7.5  The functional equation for Case I
  7.6  The asymptotic distribution for Case II
  7.7  Cases with only one absorbing barrier
 *7.8  The asymptotic distribution for one absorbing barrier
  7.9  Summary
  References

CHAPTER 8  COMMUTING OPERATORS
  8.1  The commutativity conditions
  8.2  Equal limit points of the operators
  8.3  The first occurrence of response $A_1$
  8.4  The second occurrence of response $A_1$
  8.5  Cases with one identity operator
  8.6  Experimenter-subject-controlled events with identity operators
  8.7  Summary
  References

PART II  APPLICATIONS

CHAPTER 9  IDENTIFICATION AND ESTIMATION
  9.1  The identification problem
  9.2  Reinforcement theory versus contiguity theory
  9.3  Experimental variables
  9.4  The estimation problem
  9.5  Monte Carlo checks on estimates
  9.6  Simple statistics of the data
  9.7  Bias and variance of estimators
  9.8  Maximum likelihood estimators
  9.9  A special maximum likelihood problem
  9.10 Procedures for computing [symbol illegible]
  9.11 Procedure for computing [symbols illegible]
  9.12 Variance of the estimate [symbol illegible]
  9.13 Goodness-of-fit considerations
  9.14 Summary
  References

CHAPTER 10  FREE-RECALL VERBAL LEARNING
  10.1 The experiments
  10.2 Identifications and assumptions
  10.3 The data
  10.4 Estimation of $p_0$
  10.5 Estimation of $\alpha$
  10.6 Simultaneous estimation of $q_0$ and [symbol illegible]
  10.7 Imperfect learning
  10.8 Evaluation of the model
  10.9 Summary
  References

CHAPTER 11  AVOIDANCE TRAINING
  11.1 Introduction
  11.2 The Solomon-Wynne experiment
  11.3 The model
  11.4 Estimation of the shock parameter, $\alpha_2$
  11.5 Estimation of the avoidance parameter, $\alpha_1$
  11.6 Goodness-of-fit
  11.7 A theoretical interpretation
  11.8 Experiments on the CS-US interval
  11.9 Summary
  References

CHAPTER 12  AN EXPERIMENT ON IMITATION
  12.1 The experiment
  12.2 The model
  12.3 The data
  12.4 Estimation of $\alpha_1$ and $\alpha_2$
  12.5 Estimation of $p$ [subscript illegible]
  12.6 Goodness-of-fit
  12.7 Summary
  References

CHAPTER 13  SYMMETRIC CHOICE PROBLEMS
  13.1 Introduction
  13.2 T-maze experiments
  13.3 Experiments with human subjects
  13.4 A model with experimenter-controlled events
  13.5 Data from experiments using the non-contingent procedure
  13.6 A model with experimenter-subject-controlled events
  13.7 The Stanley T-maze data
  13.8 Human experiments using the contingent procedure
 *13.9 Three-choice experiments
  13.10 Extinction data
  13.11 Comparisons and evaluations
  13.12 Summary
  References

CHAPTER 14  RUNWAY EXPERIMENTS
  14.1 The experiments
  14.2 Identification problems
  14.3 A model with discrete times
  14.4 A model with continuous times
  14.5 Estimation of parameters of the asymptotic distribution
  14.6 Analysis of the Weinstock data
  14.7 Concluding remarks
  14.8 Summary
  References

CHAPTER 15  EVALUATIONS
  15.1 Purpose of this chapter
  15.2 Measures of behavior
  15.3 Basic assumptions
  15.4 Mathematical and statistical problems
  15.5 Experimental problems
  15.6 Theoretical interpretations
  15.7 Concluding remarks
  References

TABLE A  THE FUNCTIONS $\Phi(\alpha,\beta)$ AND $\Psi(\alpha,\beta)$
TABLE B  THE FUNCTION $T(\alpha,\beta)$
TABLE C  THE FUNCTION $g$[subscript illegible]$(\alpha)$
TABLE D  THE FUNCTIONS $F(\alpha)$ AND $G(\alpha,\beta,$[illegible]$)$
GLOSSARY OF SYMBOLS FREQUENTLY USED
INDEX

Introduction 


In the construction of mathematical models there are usually three 
main steps or levels: (1) the mathematical system, (2) the identifications 
or coordinating definitions, and (3) the specific applications. 

First there must be a mathematical theory or system. Empirical 
phenomena, experimental quantities and variables, are not properly 
discussed within the mathematical system. The elements of the system 
are not operationally defined, for they are abstract concepts which acquire 
meaning only through their relationships with one another. Nevertheless, 
the empirical phenomena for which one hopes to build a model may 
suggest an appropriate mathematical system to use as a framework. We 
do not adhere to a strictly formal position, but rather we label the elements 
of our mathematical system so as to suggest possible identifications between 
them and observable quantities. Our mathematical system is concerned 
with classes of responses and events, for example. Since we believe 
that behavior is a statistical phenomenon, from a macroscopic point of 
view at least, we attempt to describe response tendencies by sets of 
probability variables. We introduce certain types of mathematical 
operators to correspond to the events which alter response tendencies 
during learning. The main outline of the mathematical system is given 
in Chapter 1. 

The second step in the development of a mathematical model is to 
state the general correspondence between elements in the mathematical 
system and empirical phenomena. In the last paragraph we have 
practically done that for our model. We will spell it out a little further. 
To make the mathematical system into a general mathematical model for 
learning, we say that the responses in the system correspond to responses 
of organisms, that the events in the system correspond to the events in the 
real world, and that the response tendencies of organisms correspond to 
the sets of probability variables in the system. Once such identifications 
are set up we have a general mathematical model for learning. Some 
would call it a general theory of learning, but we try to avoid the word 
“theory” because to psychologists our model will seem very different from 
the more classical learning theories. In several places in this book we 
suggest correspondences between our model and psychological theories. 


The third step is to make specific applications of the general model to 
actual situations. These applications might be regarded as specific 
models. The general model then generates many specific models. For 
example, the general gravitational model has specific applications to 
celestial bodies, to bodies on inclined planes, and to bodies falling near 
the earth. These are specific models that flow from a more general 
model. Similarly a general model of heat flow could be applied to describe 
temperature distributions in long rods, in thin sheets, in solid blocks, or 
under deserts. These specific models may or may not approximate 
empirical facts. Similarly, the general model developed in this book can 
be applied, for example, to describe behavior in rote learning, in avoidance 
training experiments, and in choice situations with risk. 

Only after the identifications are made in specific situations is it 
relatively easy to demonstrate that a specific model is inadequate in the 
sense that observed results do not agree with the forecasts. Close agree- 
ment does not prove that the model is correct but suggests that it may be 
useful; poor agreement indicates that the specific model, including the 
identifications, is inappropriate. When similar identifications are made 
in many different situations and good results are achieved, we begin to 
feel that the method of identification is reasonable and that the mathe- 
matical system was wisely chosen. Poor agreement between fact and 
prediction does not demonstrate that the mathematical system is un- 
satisfactory, however, for the same mathematics with new identifications 
may give quite satisfactory results. It would not surprise us if at a later 
date someone were to claim that we had made poor choices of identi- 
fications in our analyses of experimental data, and then proceed to 
re-analyze the data with essentially the same mathematics but new 
identifications. Indeed, it is next to impossible to prove that a mathe- 
matical system cannot lead to useful models, for only after repeated 
failures with different sets of identifications would we have any evidence 
that a particular mathematical system was unsuitable. One mathematical 
system can lead to many models, and only the models can be judged as 
adequate or inadequate. 

The preceding paragraphs try to distinguish among a mathematical 
system, a general model, and a specific model. The reader who is still 
confused about these distinctions should not be disturbed. Time spent 
arguing over such arbitrary categorizations contributes little to progress 
in science; such arguing is best done on convivial rather than professional 
occasions. 

We wish to make a point or two about the relation of our model to 
mathematical statistics and probability. For the most part the data 
obtained from learning experiments have been analyzed in the past by 
standard statistical techniques. In many cases a graphical presentation 
is sufficient to demonstrate a point being made by the investigator, 
whereas in other cases something like a conventional t-test is used to 
compare two groups of subjects. These techniques have been useful and 
will continue to be. But statisticians have not developed special tech- 
niques for the special problems that arise from learning data. Nothing 
comparable to the statistical methodologies developed for sampling 
inspection, bioassay, or epidemiology is available to the experimental 
psychologist who wants to study learning and memory. 

Data on animal and human learning present peculiar problems to the 
statistician: since irreversible changes take place while the data are being 
collected, repeated sampling is seldom possible. Organisms that can be 
considered “identical” at the start of an experiment do not remain com- 
pletely “identical” because each has a different history during the course 
of the experiment. Observations such as these often throw doubts on 
the routine application of standard statistical procedures. More im- 
portant, they suggest that, if methods specifically designed for handling 
these data were available, considerable gains in efficiency and meaningful- 
ness would obtain. The specific models presented in this book are so 
designed. The mathematical developments in the first eight chapters 
are for the most part topics in the field of probability, or stochastic 
processes. We use the word "stochastic" to put some emphasis on the 
temporal nature of the probability problems we consider. Later, when 
estimation procedures are developed, we are working in the field of 
mathematical statistics. Finally, when we identify the mathematical 
components with experimental components we are describing specific 
mathematical models for learning. 

In general terms we now define what we mean by learning and state our 
fundamental view of the learning process. We consider any systematic 
change in behavior to be learning whether or not the change is adaptive, 
desirable for certain purposes, or in accordance with any other such 
criteria. We consider learning to be “complete” when certain kinds of 
stability—not necessarily stereotypy—obtain. After we have described 
our model we can make these notions more explicit, but we wish to stress 
at the outset the generality with which we use the term learning. We do 
not take a position of strict determinism with respect to behavior and its 
prediction. We tend to believe that behavior is intrinsically probabilistic, 
although such an assumption is not a necessary part of our model. 
Whether behavior is statistical by its very nature or whether it appears to be so 
because of uncontrolled or uncontrollable conditions does not really matter to 
us. In either case we would hold that a probability model is appropriate 
for describing a variety of experimental results presently available. 




In order to illustrate the type of approach we follow in this book, we 
now present two simple examples of the way we later handle data from a 
learning experiment. 

FIRST EXAMPLE. For the first illustration we consider reward training 
of rats in the simplest type of T-maze experiment. (In Chapter 13 this 
problem is treated extensively.) We have an elevated T-maze with boxes 
at the two ends of the T. On each experimental trial a hungry rat is 
placed at the starting position (base of the T) and is allowed to run down 
the maze to the choice point, where it then turns right or left. On every 
trial a pellet of food is placed in the box on the right side of the maze; 
food is never placed in the box on the left side. Retracing is not allowed, 
and so the rat may not obtain food on some trials. A rat runs through 
many trials and eventually learns to turn right at the choice point. 
Dozens of aspects of the rat's behavior can be observed and measured, 
but we are concerned here with only one: whether it turned right or left 
on each trial. The data we consider then for a single rat are a sequence 
of right and left turns, and our problem is to analyze this sequence. We 
may be lucky enough to have such data for a large group of rats which 
we are willing to assume are identical. 

A model for these data can now be constructed by making several 
abstractions. On each trial the rat must go either right or left, and so 
we say there is a probability $p_n$ that the rat goes right on trial $n$, where $n$ 
has the values $1, 2, 3, \ldots$. For this exposition, we assume that none of 
the rats has an initial position preference, and so we let $p_1 = 0.5$. We 
know that later in the sequence the rat is more likely to go right than left, 
and, in fact, after a sufficiently great number of trials, the rat is almost 
certain to go right. Hence we want $p_n$ to be very near 1.00 when $n$ is 
some large number (say 300). 

The question now arises: What causes $p_n$ to increase from 0.5 for 
$n = 1$ to 1.00 for $n$ very large? Reasonably enough, we say that the 
increase is a consequence of what happens to the rat in the maze. Con- 
sider the first trial. The rat either goes right and finds food, or it goes 
left and does not find food. If the rat goes right, the reward should 
increase the probability of its going right on the second trial, that is, $p_2$ 
should be greater than 0.5. On the other hand, if the rat goes left, the 
absence of reward should decrease the probability of its going left again, 
and so in this case also the probability $p_2$ of going right on trial 2 should 
be greater than 0.5. 

For this introductory exposition we assume that the increase in probability
on the first trial is some fraction, say one-tenth, of the maximum
possible increase. The maximum possible increase is 1.0 - 0.5, or 0.5,
and so we take $p_2 = 0.5 + 0.1(1.0 - 0.5) = 0.55$. We assume here
that the effect of a reward on the right side is the same as the effect of a
non-reward on the left side. This is a very special assumption, which is
made here only for simplicity and is not made in most of the book for
problems of this sort. These assumptions are extended to all trials; on
any trial n we add to the probability $p_n$ an amount which is one-tenth of
$1 - p_n$ to obtain $p_{n+1}$. Thus, $p_3 = 0.55 + 0.1(1 - 0.55) = 0.595$,
and $p_4 = 0.595 + 0.1(1 - 0.595) = 0.6355$, and so on. We can compute
the theoretical probabilities on as many trials as we like. In Table 1 we


TABLE 1
Probabilities $p_n$ for the first 20 trials for T-maze example.

Trial number n   Probability $p_n$   Trial number n   Probability $p_n$
       1              0.5000               11              0.8257
       2              0.5500               12              0.8431
       3              0.5950               13              0.8588
       4              0.6355               14              0.8729
       5              0.6720               15              0.8856
       6              0.7048               16              0.8971
       7              0.7343               17              0.9073
       8              0.7609               18              0.9166
       9              0.7848               19              0.9250
      10              0.8063               20              0.9325


present the first twenty probabilities. This highly simplified model is 
now complete, and the problem that remains is to test the model. 
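
The computation behind Table 1 can be sketched in a few lines of Python. This is only an illustration of the rule stated above (add one-tenth of the remaining distance to 1 on every trial); the function name and its parameters are hypothetical and do not come from the book.

```python
def t_maze_probabilities(p1=0.5, a=0.1, n_trials=20):
    """Return [p_1, ..., p_n] under the rule p_{n+1} = p_n + a * (1 - p_n)."""
    probs = [p1]
    for _ in range(n_trials - 1):
        p = probs[-1]
        # Add the fraction a of the maximum possible increase, 1 - p.
        probs.append(p + a * (1.0 - p))
    return probs


table = t_maze_probabilities()
for n, p in enumerate(table, start=1):
    print(f"{n:2d}  {p:.4f}")
```

Rounded to four decimals, the printed values can be checked against the entries of Table 1.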

The obvious way to test the model is to collect data on a large number 
of nearly identical rats, say 100. The proportion of rats that turn right 
on any trial n is an estimate of the probability $p_n$. For the first trial,
$p_1 = 0.5$, and so running 100 rats on this trial is analogous to flipping
100 true coins; the proportion of heads obtained is an estimate of the
true probability of getting a head. This proportion is likely to be quite
near 0.5. Again on the second trial we observe the proportion of rats
that go right, and this proportion gives us an estimate of $p_2$. We continue
in this way and obtain estimates of $p_n$ for n from 1 to whatever total
number of trials we have had the patience to run in the experiment.
Having obtained these experimental estimates of the probabilities $p_n$, we
can compare them with the values of $p_n$ computed from the model above.
All that remains is to measure the goodness-of-fit of the model to the data. 

If we were to analyze actual data from the T-maze experiment described 
above, we would likely find that the model was a poor predictor, unless
we were very lucky. How then do we patch up the model? One obviously
arbitrary assumption is that we always add one-tenth of $1 - p_n$ to
$p_n$ to get $p_{n+1}$. This quantity 0.1 describes the rapidity of learning, and
perhaps it is too small for the rats used in the experiment. We might
try 0.2 or 0.3 and recompute the theoretical probabilities. More generally
we could try many such values until we got good agreement with the data,
or at least the best possible agreement. Fortunately this trial-and-error
procedure is not necessary. We do some algebra instead. We let this
fraction be a, and we compute $p_2, p_3, \ldots$ in terms of a. The problem is
to choose the value of a that yields the best fit to the data. In the language
of mathematical statistics, we need to estimate the parameter a from the
data.
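
The algebra referred to above can be sketched as follows. This is a worked illustration under the assumptions of this example only (one fraction a applied on every trial, starting value $p_1$); it is not taken from the book's later general derivation.

```latex
\begin{align*}
p_{n+1} &= p_n + a(1 - p_n) \\
1 - p_{n+1} &= (1 - a)(1 - p_n) \\
1 - p_n &= (1 - a)^{n-1}(1 - p_1) \\
p_n &= 1 - (1 - p_1)(1 - a)^{n-1}
\end{align*}
```

With $p_1 = 0.5$ and $a = 0.1$ this gives $p_4 = 1 - 0.5(0.9)^3 = 0.6355$, in agreement with the step-by-step computation.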

We might inquire into further possible generalizations of the model just 
described. Even the one generalization already introduced—letting a be 
determined from the data—may not be sufficient to account for the data. 
We could estimate the probability $p_1$ for the first trial from the data instead of
assuming it to be 0.5, and this procedure might lead to better agreement
between model and data because real animals often have position preferences.
Beyond this, we could give up the assumption that reward on the
right side causes exactly the same increase in probability of turning right
as does non-reward on the left side. The value of the parameter a might
be different for these two events. But it is just at this point that the model
becomes much more complicated. For example, the probability $p_2$
depends on what the rat actually did on trial 1, and $p_3$ depends on what
the rat did on trials 1 and 2. Different rats will produce different sequences
of right and left turns, usually at least, and so different rats will have
different probabilities on trial 3, for example. For a large population of
rats we shall have a distribution of probabilities, not just a single probability
$p_n$, on each trial. In most of this book we are concerned with such
distributions, their properties from trial to trial, and with methods of 
estimating parameters of these distributions from experimental data. 

SECOND EXAMPLE. We now present a second example of an experi- 
mental problem in learning and indicate how we construct a model and 
analyze the data. The experiment is one reported by Solomon and 
Wynne on the avoidance training of dogs. In Chapter 11 we analyze 
this experiment in more detail after we have developed our mathematical 
system and the general model. In this introduction we describe only 
the specific model for this problem and indicate one way the data can be 
analyzed. 

The Solomon-Wynne experiment used an intense electric shock from 
which a dog could escape by jumping over a barrier. A conditioned 
stimulus preceded the onset of shock by 10 seconds. The dogs learned
to avoid the electric shock by responding to this conditioned stimulus.
The data which concern us here are simply whether each dog on each trial 
avoided or received shock. Solomon and Wynne observed a sequence of 
shocks and avoidances for each dog. These sequences eventually con- 
tained all avoidances, that is, the dogs learned to avoid with certainty. 

Many statistics of the sequential data may be computed, and several of 
these are given by Solomon and Wynne. Examples are the mean number 
of trials before the first avoidance (mean for 30 dogs), mean number of 
trials before the second avoidance, mean number of shocks received 
during all acquisition trials, etc. Such statistics may be computed 
directly from the data without the use of a model. Moreover, when
certain experimental conditions are varied, such as the intensity of the
shock or the time between the conditioned stimulus and the onset of shock,
these statistics may be obtained for each such condition. The only
problem that remains is to interpret these statistics, that is, to infer what
they mean in conditioning and learning terms. The following model
provides one way to interpret the data.

We assume that on each trial n ($n = 1, 2, 3, \ldots$) there is a true probability
$q_n$ that a particular dog will receive shock, that is, that it will not
jump soon enough to avoid shock. Since the dogs in fact do learn to
avoid, the probabilities $q_n$ must decrease as n gets large. What causes
the $q_n$ to decrease? We assume that the changes in the $q_n$ are a consequence
of the animals' previous experience. In particular, if a dog avoids
the shock on trial n, then the probability that it will be shocked on the
next trial is some constant $\alpha_1$ times $q_n$. We require that $\alpha_1$ be between
zero and unity so that $\alpha_1 q_n$ is less than $q_n$, but remains positive. Similarly
we assume that if the dog receives a shock on trial n, then $q_n$ is multiplied
by some other constant $\alpha_2$ to give the probability of shock on the next
trial. Again $\alpha_2$ must be between zero and unity. Thus, on each trial
we multiply the probability of shock on that trial by either $\alpha_1$ or $\alpha_2$ to
obtain the probability of shock on the next trial.

We make one further assumption which is strongly supported by the 
data of Solomon and Wynne. We assume that the probability $q_1$ of
shock on the first trial is unity, that is, that each dog is certain to be
shocked on the first trial. By our previous assumption, then, the probability
$q_2$ of shock on the second trial is $\alpha_2 q_1$, or just $\alpha_2$. Now on this
second trial a dog may either avoid or be shocked. If the dog avoids on
the second trial, the probability $q_3$ that it will be shocked on the third
trial is $\alpha_1 q_2$, or $\alpha_1 \alpha_2$; if the dog is shocked on the second trial, the probability
$q_3$ that it will be shocked on the third is $\alpha_2 q_2$, or $\alpha_2^2$. Proceeding
in this way, we may compute the probability of shock $q_n$ on any trial n if
we know how many previous shocks or avoidances the dog had. If we
denote the number of previous shocks by j and the number of previous
avoidances by k, we have $q_n = \alpha_1^k \alpha_2^j$. Previous to trial n there will have
been n - 1 trials, and so $j + k = n - 1$. With this information we can
compute various statistics of the data in terms of the two parameters $\alpha_1$
and $\alpha_2$. For example, we could compute the expected fraction of dogs
that are shocked on each trial. This fraction would tend to zero as the
number of trials becomes large because both $\alpha_1$ and $\alpha_2$ are assumed to be
less than one. This conclusion is required by the data; indeed, we have
constructed the model so that after many trials avoidance would almost 
always occur. 

The main problem which faces us is how to estimate $\alpha_1$ and $\alpha_2$ from the
data in order to get a close fit between the data and the model. In Chapter
11 we discuss several ways of estimating these two parameters, but here
we describe only one way. This procedure is not very efficient, but it is
adequate for the present illustration. Consider first a statistic mentioned
earlier, the mean number of trials before the first avoidance. We denote
this statistic by $F_1$. This number $F_1$ depends only on our parameter $\alpha_2$
and not upon $\alpha_1$. We see this as follows. Prior to the first avoidance, a
dog is shocked on every trial, and so its probability of shock on each of
these trials is $\alpha_2$ to some power j, where j is the number of previous shocks.
The parameter $\alpha_1$ is not involved until after the first avoidance. The
statistic $F_1$, then, depends only on $\alpha_2$. In Chapter 8 this function is
derived, but here we provide only a table based upon that derivation.
In Table 2 we show some computed values of $F_1$ for different values of $\alpha_2$.
(Table A at the end of the book is more extensive.) We enter Table 2
with the statistic $F_1$, which can be computed directly from the data, and


TABLE 2
Mean number of trials, $F_1$, before the first avoidance for different values of $\alpha_2$.

$\alpha_2$    $F_1$      $\alpha_2$    $F_1$
0.81     2.81      0.91     4.13
0.82     2.88      0.92     4.39
0.83     2.97      0.93     4.70
0.84     3.07      0.94     5.08
0.85     3.17      0.95     5.57
0.86     3.29      0.96     6.23
0.87     3.42      0.97     7.21
0.88     3.56      0.98     8.84
0.89     3.73      0.99    12.52
0.90     3.91      1.00    infinity


so obtain an estimate of $\alpha_2$. In the Solomon-Wynne experiment, the
mean number of shocks before the first avoidance was $F_1 = 4.50$. From
Table 2, therefore, we see that $\alpha_2$ is approximately 0.92.
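
The dependence of this statistic on $\alpha_2$ alone can be illustrated numerically. The sketch below is not the Chapter 8 derivation; it simply sums, under the model as described here, the probability that a dog is shocked on every one of the first n trials, which is $\alpha_2^{0} \alpha_2^{1} \cdots \alpha_2^{n-1}$. The function name and the tolerance argument are hypothetical.

```python
def mean_trials_before_first_avoidance(alpha2, tol=1e-12):
    """Expected number of shock trials preceding the first avoidance.

    Uses E[N] = sum over n >= 1 of P(shock on each of trials 1..n),
    where the shock probability after j shocks (and no avoidance) is alpha2 ** j.
    """
    if not 0.0 <= alpha2 < 1.0:
        raise ValueError("alpha2 must lie in [0, 1) for the sum to converge")
    total = 0.0
    run_prob = 1.0  # probability of an unbroken run of shocks so far
    q = 1.0         # shock probability on the current trial (q_1 = 1)
    while True:
        run_prob *= q      # the run now includes the current trial
        if run_prob < tol:
            break
        total += run_prob
        q *= alpha2        # one more shock multiplies q by alpha2
    return total


for alpha2 in (0.90, 0.92, 0.95):
    print(alpha2, round(mean_trials_before_first_avoidance(alpha2), 2))
```

The values obtained this way lie close to the corresponding entries of Table 2; small differences in the last digit are possible, since the tabled values come from the book's own derivation.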

Now consider another statistic, $T_2$, defined as the mean total number of
shocks. This statistic $T_2$ will be a function of both $\alpha_1$ and $\alpha_2$ in our model,
because all the data are involved in obtaining $T_2$. Again we defer the
derivation of the function until Chapter 8. However, in Table 3 we give
computed values of $T_2$ for different values of $\alpha_1$ and $\alpha_2$. (Table B is more
extensive.)

Because $T_2$ is obtained directly from the experimental data, and since $\alpha_2$
has already been estimated, Table 3 can be used to estimate $\alpha_1$. From the


TABLE 3
Mean total number of shocks, $T_2$, for different values of $\alpha_1$ (rows) and $\alpha_2$ (columns).

$\alpha_1$ \ $\alpha_2$ | 0.90  | 0.91  | 0.92  | 0.93  | 0.94  | 0.95  | 0.96  | 0.97  | 0.98  | 0.99
0.75 |  6.11 |  6.39 |  6.70 |  7.07 |  7.51 |  8.05 |  8.73 |  9.64 | 10.98 | 13.41
0.76 |  6.25 |  6.54 |  6.87 |  7.25 |  7.70 |  8.26 |  8.96 |  9.90 | 11.30 | 13.82
0.77 |  6.41 |  6.70 |  7.04 |  7.44 |  7.90 |  8.48 |  9.21 | 10.18 | 11.63 | 14.25
0.78 |  6.57 |  6.88 |  7.23 |  7.63 |  8.12 |  8.72 |  9.47 | 10.49 | 11.99 | 14.72
0.79 |  6.75 |  7.06 |  7.42 |  7.85 |  8.35 |  8.97 |  9.75 | 10.81 | 12.38 | 15.22
0.80 |  6.93 |  7.26 |  7.64 |  8.08 |  8.60 |  9.24 | 10.06 | 11.16 | 12.79 | 15.77
0.81 |  7.13 |  7.47 |  7.86 |  8.32 |  8.87 |  9.54 | 10.39 | 11.54 | 13.24 | 16.36
0.82 |  7.35 |  7.70 |  8.11 |  8.59 |  9.16 |  9.85 | 10.74 | 11.95 | 13.73 | 17.00
0.83 |  7.58 |  7.95 |  8.38 |  8.87 |  9.47 | 10.20 | 11.13 | 12.39 | 14.27 | 17.71
0.84 |  7.83 |  8.22 |  8.66 |  9.19 |  9.81 | 10.57 | 11.55 | 12.88 | 14.85 | 18.48
0.85 |  8.11 |  8.51 |  8.98 |  9.53 | 10.18 | 10.99 | 12.02 | 13.41 | 15.50 | 19.34
0.86 |  8.41 |  8.84 |  9.33 |  9.90 | 10.59 | 11.44 | 12.53 | 14.00 | 16.22 | 20.30
0.87 |  8.75 |  9.19 |  9.71 | 10.32 | 11.05 | 11.94 | 13.10 | 14.66 | 17.02 | 21.38
0.88 |  9.12 |  9.59 | 10.14 | 10.78 | 11.55 | 12.51 | 13.73 | 15.40 | 17.92 | 22.59
0.89 |  9.53 | 10.03 | 10.62 | 11.30 | 12.12 | 13.14 | 14.45 | 16.24 | 18.94 | 23.98
0.90 | 10.00 | 10.54 | 11.16 | 11.89 | 12.77 | 13.86 | 15.27 | 17.20 | 20.12 | 25.58


Solomon-Wynne data the mean total number of shocks is $T_2 = 7.80$;
we have just obtained the value $\alpha_2 = 0.92$. Hence we look down the
column in Table 3 corresponding to $\alpha_2 = 0.92$ until we find the number
closest to 7.80. This appears in the row for $\alpha_1 = 0.81$, and so this is our
estimate of the parameter $\alpha_1$.
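
For readers who wish to experiment with other parameter values, the expected total number of shocks under the model described above can be approximated by direct enumeration. The sketch below is an illustration only and is not the Chapter 8 derivation: it tracks the probability distribution of the number of previous shocks j trial by trial, uses $q_n = \alpha_1^k \alpha_2^j$ with $k = n - 1 - j$, and truncates after a fixed number of trials. The function name and the truncation length are assumptions made for this example.

```python
def expected_total_shocks(alpha1, alpha2, n_trials=600):
    """Approximate the expected total number of shocks over n_trials trials.

    dist[j] is the probability that exactly j shocks have occurred before
    the current trial; the remaining k = (n - 1) - j trials were avoidances.
    """
    dist = [1.0]  # before trial 1 there have been no shocks
    total = 0.0
    for n in range(1, n_trials + 1):
        new_dist = [0.0] * (len(dist) + 1)
        for j, prob in enumerate(dist):
            k = (n - 1) - j
            q = (alpha1 ** k) * (alpha2 ** j)  # shock probability on trial n
            total += prob * q                  # contribution to expected shocks
            new_dist[j + 1] += prob * q        # shocked on trial n
            new_dist[j] += prob * (1.0 - q)    # avoided on trial n
        dist = new_dist
    return total


print(round(expected_total_shocks(0.81, 0.92), 2))
```

When the two parameters are equal, $q_n = \alpha^{n-1}$ regardless of the dog's history, so the expected total is the geometric sum $1/(1 - \alpha)$; for $\alpha = 0.90$ this is 10, which agrees with the corresponding corner entry of Table 3. Other values can be compared with the table in the same way, bearing in mind that the truncation makes the result slightly low when either parameter is very close to 1.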


The estimation procedures just described are by no means the best
possible, as they have quite large sampling errors. However, we have
made our point: the parameters $\alpha_1$ and $\alpha_2$ can be estimated from the data,
in this case from the statistics $F_1$ and $T_2$. The question is whether $\alpha_1$
and $\alpha_2$ are more meaningful than $F_1$ and $T_2$. Our answer is in two parts.
The first part has to do with the purpose of the model; in terms of two
parameters, $\alpha_1$ and $\alpha_2$, the model predicts properties of the entire sequence
of behavior and these predictions are testable. Numerous other statistics
of the data can be computed from the obtained estimates of $\alpha_1$ and $\alpha_2$ and
the model. In this sense, $\alpha_1$ and $\alpha_2$ are more useful than $F_1$ and $T_2$.

The second part of our answer to the question about the meaningfulness 
of $\alpha_1$ and $\alpha_2$ has to do with the meaning of these parameters when they
were first introduced in the model. Consider $\alpha_1$; we asserted that if
avoidance occurs on trial n, the probability of shock on trial n + 1 is
$q_{n+1} = \alpha_1 q_n$. The parameter $\alpha_1$, therefore, is a measure of the
"ineffectiveness" of an avoidance trial in reducing the probability of shock.
The nearer $\alpha_1$ is to unity, the less effective, and the nearer $\alpha_1$ is to zero the
more effective is one avoidance trial. Similarly, $\alpha_2$ is a measure of the
ineffectiveness of a shock trial in reducing the probability of shock.

We estimated that $\alpha_1 = 0.81$ and $\alpha_2 = 0.92$ for the Solomon-Wynne
data. Hence, in that experiment, an avoidance trial has a greater effect
than a shock trial. We can compare the effectiveness of shock and avoidance
trials by finding out how many shock trials have an effect equivalent
to one avoidance trial. We just need to know how many times to multiply
0.92 by itself to get 0.81. It turns out that $(0.92)^{2.5}$ is approximately 0.81,
and so we infer that an avoidance trial has the same effect as 2.5 shock
trials. Without a model, an inference such as this could not be made
readily. The parameters $\alpha_1$ and $\alpha_2$ may change, of course, when experimental
conditions are varied. In Chapter 11 we discuss estimates of these
parameters for further data obtained by Solomon and his colleagues. 
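
The figure of 2.5 shock trials can be checked with a logarithm: the number x of shock trials equivalent to one avoidance trial solves $\alpha_2^x = \alpha_1$. The short sketch below assumes only the two estimates quoted above; the function name is hypothetical.

```python
import math


def equivalent_shock_trials(alpha1, alpha2):
    """Solve alpha2 ** x == alpha1 for x, i.e. x = log(alpha1) / log(alpha2)."""
    return math.log(alpha1) / math.log(alpha2)


x = equivalent_shock_trials(0.81, 0.92)
print(round(x, 2))
```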

The foregoing two examples illustrate the general strategy we follow in 
applying our mathematical system to particular situations. In order to 
be as intelligible as possible to a non-mathematical reader, we here sup- 
pressed the mathematics and emphasized the problems of interpreting 
empirical data. The time has now come to make the mathematical 
structure explicit; we hope these two examples will provide guidance 
through mathematical developments that are sometimes unavoidably 
difficult and circuitous. 


PART I 


THE MATHEMATICAL 
SYSTEM AND THE 
GENERAL MODEL 


CHAPTER 1 


The Basic Model 


1.1 INTRODUCTORY COMMENTS 


This chapter presents the basic structure of the mathematical system 
used throughout this book. Descriptions of this mathematical system 
have already appeared in the literature [1, 2]. As each concept is intro- 
duced it is discussed both intuitively and formally. First we give intuitive 
arguments for the necessity or desirability of each concept for describing 
learning data and suggest possible identifications between the mathematical
constructs and empirical quantities. As pointed out in the
Introduction, specific identifications are not postulated until Part II,
where the mathematical system is applied to various experiments. In
addition to an intuitive introduction to each concept this chapter also
contains a more formal description, a description that divorces
the mathematical concept from experimental problems. The formal
description may indicate the possible generality of the mathematical
system and thus may suggest to some readers quite different applications
from those discussed in Part II.


1.2 RESPONSE CLASSES AND PROBABILITIES 


In order to describe behavioral changes, we must distinguish among 
various kinds of responses. For even the simplest type of learning, such 
as bar-pressing by rats, we need two classes of responses, those which 
terminate in a relay closure indicating a bar press and those which do not. 
Bar-pressing and not-bar-pressing are taken to be mutually exclusive and 
exhaustive classes of responses. It is not necessary to identify all responses 
with overt motor activity—doing "nothing" is a response, and in the 
example just cited falls into the class of not-bar-pressing. Although most
of our analysis will deal with just two response categories, we shall define
in general r mutually exclusive and exhaustive classes of responses. An
example of a situation in which more than two response classes are
necessary is an experiment in which a human subject is asked to choose on
each trial among several words or abstract symbols.




Some readers may object at the outset to our taking the response 
classes as mutually exclusive; they may feel that this is a serious limitation 
on our framework. For example, a rat may press a bar and flick its tail
simultaneously, or a person may withdraw his hand from an electrified
contact and say "ouch" at the same instant. Such considerations are
inappropriate at this point, we feel, because we have not given the response 
classes explicit operational definitions, or, more precisely, we have not 
postulated identifications. We do take the position, however, that 
mutually exclusive classes of behavior can always be defined in any 
experimental problem. For instance, we might define one class as 
bar-pressing with a tail flick and another class as bar-pressing without a 
tail flick, and these two classes would be mutually exclusive. Or, perhaps 
more usefully, we could define a class as bar-pressing with or without 
tail flicks, that is, the union of the two classes just mentioned. In this 
example, the choice of definition would hinge upon what was being
studied experimentally: bar presses, tail flicks, or both. In other words,
mutually exclusive classes can always be found, but the serious question 
is whether or not such classes are useful. 

The requirement that the r response classes be exhaustive causes no 
difficulties, for a residual category ("everything else") can always be
introduced. For example, in a runway experiment we may be focusing 
attention on a rat’s response of leaving a starting box and we may record 
only the time at which this response occurred. Prior to an occurrence of 
such a response the rat does many other things which we classify together 
as not-leaving-starting-box. 

In our mathematical system we represent the r response classes by a set
of alternatives $A_1, A_2, \ldots, A_r$. The nature of these alternatives is not
specified in the mathematical system, for they play a role analogous to
that of points and lines in geometry. Later, when we apply our system
to experimental problems we identify the alternatives $A_j$ with certain
classes of behavior. At times we call them response classes in anticipation
of the applications.

As an index of behavior we have chosen a set of probabilities, $p_j$,
where $j = 1, 2, \ldots, r$, one for each alternative or class of responses.
In Part II we show how those probabilities can be related to experimental
measures of behavior. We define $p_j$ as the probability that alternative
$A_j$ will be chosen, for example, that a member of the jth class of responses
will be performed on a trial. A trial is defined as an opportunity for
choosing among the r alternatives. A trial so defined will correspond 
to an experimental trial in many problems, but in other problems in which 
time is an important variable a trial will be a short interval of time. 
More is said about these time problems in Chapter 14. 




Since the r alternatives are exhaustive and mutually exclusive, some 
alternative must occur on each trial. That is to say, we have 


THE PROBABILITY INVARIANCE RULE:

(1.1)    $p_1 + p_2 + \cdots + p_r = \sum_{j=1}^{r} p_j = 1, \qquad 0 \le p_j \le 1.$

If the alternatives were not exhaustive the probabilities could add up to 
less than unity, and, if they were not mutually exclusive, they could add 
up to more than unity. If they were neither mutually exclusive nor 
exhaustive, the total could be any positive number. The probability 
invariance rule states that probability cannot be created or destroyed; 
the total probability is always the same on every trial. It can, however, be 
moved about from one alternative to another. Such a flow of probability 
from one class of responses to another is the basis for our description of 
learning. In fact, the most that a model derived from our mathematical 
system could predict is the set of r probabilities on every trial. Complete 
learning, as we use the expression, does not correspond necessarily to 
stereotyped behavior where one response always occurs. The final stable 
probability of a response might be 0.8, for example, and so that response 
would occur on the average 80 percent of the time. In other words, the 
completion of learning leads only to certain types of statistical stability. 
As an example of the behavior we might expect after complete learning, 
consider an experiment in which a person may (1) press a green button, 
(2) press a red button, or (3) press a white button. Let the probabilities 
of these responses stabilize at 0.6, 0.3, and 0.1, respectively. During an
observation period of 100 trials after stability is reached, we would expect 
to find then that the person had pressed the green button about 60 times, 
the red button about 30 times, and the white about 10. 

The probability $p_j$ that a jth class of responses will occur on a particular
trial cannot be directly measured. We conceive that every organism
possesses a "true" probability $p_j$ at the start of each trial.* As far as
the mathematical system is concerned, the physical basis for this probability
is irrelevant. A variety of physiological models might provide it.
Any sort of fictitious device that aids us to think about probabilities is as
acceptable as any other. For example, we might imagine that the organism
has a small disc that it can rotate. The disc is part shaded and part white,
as shown in Fig. 1.1; the area of the shaded sector is proportional to $p_j$
and that of the white sector to $1 - p_j$. At the beginning of each trial,
the organism whirls the disc, and, if a fixed marker points to the shaded
part when the disc comes to rest, the jth response is made, and if it points
to white the jth response is not made. In other words, all sectors of the
same area have the same chance of being pointed to after one whirl of the
disc. This principle can be extended to as many classes as we choose. For
example, for four classes with probabilities 0.5, 0.3, 0.1, and 0.1, we have
a disc as shown in Fig. 1.2. As before, a decision is made by rotating the
disc and observing which portion of the disc appears opposite the marker
when the disc comes to rest.

* Several discourses on probability theory are available [3, 4].

We could also think of the organism as possessing an urn of black and
white balls with the proportion of black balls equal to $p_j$. The organism
draws from the urn to decide whether or not to make the jth response.
Another possible analogy is that the organism uses a random number
table for making decisions. The above heuristic devices have little to do,
of course, either with the real world or with the mathematical system.
However, we postulate that organisms behave as if they possessed such
probability mechanisms.

Fig. 1.1. Illustration of a disc which may be rotated for determining
whether or not the jth response is to be made. The shaded area is
proportional to $p_j$ and the unshaded area to $1 - p_j$. The fixed marker is
indicated by the arrow.

Fig. 1.2. A "decision-making disc" for the example of four response
classes and probabilities of 0.5, 0.3, 0.1, and 0.1.

We are able to obtain estimates of the probabilities in certain special
cases. If $p_j$ is constant, we may estimate $p_j$ from the proportion of trials
in which an animal makes responses of class $A_j$. We are, in effect,
estimating $p_j$ by sampling from a population of responses of a single
animal. On the other hand, if we had a large number of animals, we
could take as an estimate of the value of $p_j$ the proportion of animals
which made the jth response when the trial occurs. However, all these
animals would have to be stochastically identical; by this we mean that
each animal would necessarily have the same true probability $p_j$ at the
time of measurement.
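
The decision-making disc and the proportion estimate just described can be imitated in a short simulation. This is a sketch for intuition only, with hypothetical function names; it assumes constant probabilities, represents one whirl of the disc by a uniform random number in [0, 1), and uses the four-class example of Fig. 1.2.

```python
import random


def spin_disc(probs, u):
    """Return the index of the sector opposite the marker.

    probs: sector areas as fractions of the disc (they sum to 1).
    u: resting position of the disc as a fraction of a full turn, in [0, 1).
    """
    edge = 0.0
    for j, p in enumerate(probs):
        edge += p  # right-hand boundary of sector j
        if u < edge:
            return j
    return len(probs) - 1  # guard against rounding in the final boundary


def estimate_probabilities(probs, n_trials, rng):
    """Estimate each p_j by the proportion of trials on which class j occurs."""
    counts = [0] * len(probs)
    for _ in range(n_trials):
        counts[spin_disc(probs, rng.random())] += 1
    return [c / n_trials for c in counts]


probs = [0.5, 0.3, 0.1, 0.1]
estimates = estimate_probabilities(probs, 1000, random.Random(0))
print(estimates)
```

The same function serves for either sampling scheme mentioned in the text: 1000 trials of one animal with constant probabilities, or one trial each of 1000 stochastically identical animals.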


1.3 FACTORS WHICH CHANGE THE PROBABILITIES


Our next task is to specify what factors in the process change the set 
of probabilities. We begin with a very general view: whenever certain 
events occur, the probabilities are altered in a determined way. The 
nature of these events depends upon the experimental situation being 
considered, but in the mathematical system we need only an abstract set 
of t events, $E_1, E_2, \ldots, E_t$.

For purposes of exposition we may take a more restricted view. We
assume that every time a response occurs it has an outcome. This outcome
may be a reward given by an experimenter, it may be a change in
the external environment, or it may be merely proprioceptive stimulation.
We denote the possible outcomes by $O_1, O_2, \ldots, O_s$. Whatever the
outcome, we assume that it has some effect (including zero effect) on the
r probabilities associated with the r response classes. It is our opinion that
this assumption is not inconsistent with most current learning theories:
reinforcement theorists emphasize the environmental events of reward
and punishment, whereas the association theorists are concerned mainly
with changes in external and internal stimulation. From our point of
view these are all outcomes of trials.

We assume furthermore that a particular outcome following a given
response changes the set of r probabilities in a unique way which is
independent of earlier events in the process. In other words, if we are
given a set of probabilities $p_j$ on trial n, the new set on trial n + 1 is
completely determined by the response occurring on trial n and its outcome.
(Earlier events will, of course, determine the $p_j$ on trial n.) This assumption
we shall refer to as the independence-of-path assumption. Note
that this assumption means that we are not concerned with the route or
path used to achieve the value of $p_j$. The theory is ahistorical in the
sense that the effects of past events can influence the future of the organism
only if they have influenced its present state and are somehow embodied
in the description of the present state. Just as the velocity of a freely
falling body contains a "memory" of how far the body has fallen, the $p_j$
contain a "memory" of the events that produced them.

We have now arrived at the following position. A learning experiment
consists of a sequence of trials, on each of which one and only one response
occurs. Each response occurrence has an outcome which alters the
probabilities of the various responses. We now need some mathematical
machinery to describe the effects of various outcomes on the set of r
probabilities; this is the subject of the next section.


1.4 THE CONCEPT OF AN OPERATOR 


All mathematical operations on a function f can be defined in
terms of operators (see Davis [5]). A familiar operator is the differential
operator $d/dx$; the quantity $df/dx$ is broken up into an operator $d/dx$ and an
operand f. For a function f the operator $d/dx$ defines a new quantity $df/dx$.
Whereas $f(x)$ denotes the value of the function at x, $df/dx$ represents the rate
of change of the value of the function. Other familiar examples of
operators are log and sin. Two special operators are the identity operator
which leaves every operand unchanged, and the null operator which
causes every operand to become zero. In general, an operator O when
applied to an operand x defines a new quantity Ox. Hence O represents a
transformation on all values of x. (We use the letter O because it is
the first letter of the word Operator.)

Even simpler operations of addition and multiplication may be defined
in terms of operators, and we introduce these operators here. The operator
A when applied to a variable x will indicate the addition of a constant
a (the symbol A is used to denote addition):

DEFINITION OF A:
(1.2)    $Ax = x + a.$

The notation Ax does not mean A multiplied by x.

The meaning of the operator A may become clearer if we perform
operation A on operand x more than once. The notation $A^2 x$ will be
used to denote the application of A to a quantity Ax. We then have the
definition

$A^2 x = A(Ax).$

We note from the definition of A that an application of A to any operand
x is identical to adding a quantity a to that operand. Hence, we have

$A(Ax) = (Ax) + a.$

But we know from the definition of A the value of Ax in terms of x, and
so we obtain

$A^2 x = A(Ax) = (Ax) + a = (x + a) + a = x + 2a.$

We may readily generalize the above to n applications of A to x and
obtain

(1.3)    $A^n x = AAA \cdots Ax = x + na.$




Similarly, an operator M when applied to x will serve to indicate that
x is to be multiplied by a constant m.

DEFINITION OF M:
(1.4)    $Mx = mx.$

Note that the left side of this equation does not mean "M times x" (it
does mean "M operating on x"), but the right side does mean "m times x."
When we apply M to x twice we obtain

$M^2 x = M(Mx) = m(Mx) = m(mx) = m^2 x,$

and when we apply M to x a total of n times, there results

(1.5)    $M^n x = m^n x.$

The two operators, A and M, defined above, will be of considerable 
interest in our mathematical development, and so we shall here indicate 


what happens when both A and M are applied to an operand x. First, 
when we apply M to x and then apply A to Mx we get 


A(Mx) = (Mx) + a = mx + a.


Second, when we apply A to x and then apply M to (Ax) we have


M(Ax) = m(Ax) = m(x + a) = mx + ma.


We see at once that the order of application of M and A is important.
Because the operation AMx yields a different result from the operation
MAx, the operators A and M are said to be non-commutative. If the
order of application of a pair of operators can be reversed without affecting
the result, the pair of operators is said to commute. The difference


AMx - MAx = (mx + a) - (mx + ma) = a(1 - m)


is called the commutator of A and M.
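The behavior of A and M under repeated and mixed application can be checked numerically. The following Python sketch is only an illustration: the helper names make_add, make_mul, and apply_n, and the values chosen for a, m, and x, are our own assumptions and do not come from the text.

```python
def make_add(a):
    """Return the operator A, which maps x to x + a."""
    return lambda x: x + a


def make_mul(m):
    """Return the operator M, which maps x to m * x."""
    return lambda x: m * x


def apply_n(op, x, n):
    """Apply the operator op to the operand x a total of n times."""
    for _ in range(n):
        x = op(x)
    return x


# Illustrative (assumed) values for the constants and the operand.
a, m, x = 0.5, 3.0, 2.0
A, M = make_add(a), make_mul(m)

am = A(M(x))          # AMx = mx + a
ma = M(A(x))          # MAx = mx + ma
commutator = am - ma  # equals a(1 - m)

print(apply_n(A, x, 4), apply_n(M, x, 3), am, ma, commutator)
```

With these values AMx and MAx differ, and their difference agrees with a(1 - m); the two operators commute only when a = 0 or m = 1.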

We may denote the operation AM above by a new operator L applied
to an operand x. We see that this leads to the most general linear function
of x. A linear function of x is, by definition, a constant plus the product
of x and another constant. Throughout our discussion we shall be
concerned mainly with operators which represent a linear transformation
on a probability variable p. From now on the general variable x we shall
be working with is probability. These operators have the form


DEFINITION OF L:


(1.6) $Lp = a + mp, \qquad 0 \le p \le 1.$


The variable p is defined only for numbers from zero to unity, inclusive,
for there are no negative probabilities nor any probabilities greater than
unity. The quantity Lp will also be considered a probability and so it
also is defined only for numbers from zero to unity, inclusive, a
requirement that will place restrictions on the constants a and m. Our basic
learning operators will be like L in the last equation; p may be the
probability of a certain response on trial n, and Lp will be its probability on
trial n + 1. The operator L, then, describes the effects of the outcome
of whatever occurred on trial n on the probability p.

The question arises as to why we define L to represent a linear
transformation on the probability rather than some other transformation. The
answer is easy: linearity is assumed in order to simplify the mathematical 
analysis. Many functions that are used in applied work can be approxi- 
mated by polynomials. The higher the degree of the polynomial, the 
better the approximation usually is. In fact many functions such as 
logarithms, sines, and exponentials can be represented by infinite series. 
From this point of view the linearity assumption we use represents the 
fitting of the first two terms of a series. To put it in symbols, if we 
regard an operation Op as expressible by a series, 


$Op = a_0 + a_1 p + a_2 p^2 + \cdots,$
we have taken Lp to be an approximation to the true function Op. 

The use of linear transformations in the learning model to be presented 
may seem to many readers as a very strong restriction. It is. However, 
it will be seen presently that even with these easy transformations, the 
mathematics becomes complicated and leads to many unsolved problems 
in stochastic processes. There is little hope indeed of solving these 
problems with non-linear transformations. Of course the ultimate 
justification of the linearity assumption depends upon the agreement 
between model and experiment, but, if it is any comfort to the reader, we 
hasten to point out that the whole of quantum physics is based upon an 
assumption of linear operators. (Although linear operators are basic 
in the mathematical machinery which follows, linearity is not essential 
to the general approach.) 

One further remark is appropriate at this point. In the last section, we
mentioned the independence-of-path assumption, according to which the
probability on trial n + 1 depended upon the probability on trial n and
not upon how it got there, that is, not upon earlier values of probability.
We now see the desirability of that assumption, for without it, our
operation Lp would not be a function of p alone as given in its definition
(equation 1.6); it would also be a function of previous probabilities.

It might be mentioned that operators having the form of A might be 




used to construct a learning model. Such a moenia b ban: 
be easy to use. If the operator were applied directly to probabilities, 
there would be one rather obvious objection. Suppose a = 0.1 and the
initial probability of making the response is p = 0.1. When the event
occurs that makes A applicable, the probability of the response changes
to p + a = 0.1 + 0.1 = 0.2. Suppose after further learning the
probability has reached 0.9. Again the event that makes A applicable occurs,
and the probability is changed to 1.0. The example shows that the gain
in probability due to the event is 0.1, no matter what the original
probability was. This result is contrary to most experience which suggests
that it is more difficult to close the gap between p = 0.9 and 1.0 than to
close the gap between p = 0.1 and 0.2. Such a difficulty might be got
around by operating on some variable related to p instead of p itself.
For example, A might operate on x, where x is 1/(1 - p). Then when
p = 0.0, x = 1. With a = 0.1, one operation of A leads to Ax = 1.1,
which when solved for the new value of p gives 0.091. When p = 0.5,
x = 2, and one operation of A gives Ax = 2.1, which yields a p of 0.524.
This formulation then would lead to smaller increments in p the larger the
value of p. Remarks analogous to these could be made about the operator
M. We shall not pursue the matter further here. We actually use M
as a special case.
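The arithmetic of operating on x = 1/(1 - p) rather than on p itself can be reproduced with a short Python sketch. The function name add_on_transformed is a hypothetical label of ours; only the transformation and the value a = 0.1 are taken from the passage above.

```python
def add_on_transformed(p, a):
    """Apply the additive operator A to x = 1/(1 - p) and convert back to p.

    Requires p < 1, since x is undefined at p = 1.
    """
    x = 1.0 / (1.0 - p)   # transformed variable
    x_new = x + a         # Ax = x + a
    return 1.0 - 1.0 / x_new


p_from_0 = add_on_transformed(0.0, 0.1)     # start at p = 0.0, x = 1
p_from_half = add_on_transformed(0.5, 0.1)  # start at p = 0.5, x = 2

print(round(p_from_0, 3), round(p_from_half, 3))
```

The increment from p = 0.0 is larger than the increment from p = 0.5, which is the property the passage describes.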
1.5 MATRIX OPERATORS 


A mathematically convenient type of operator is a matrix, for a matrix 
can operate on a whole set of variables at the same time. Moreover, 
matrices can be used to represent all linear transformations on a set of 
variables. Matrix operators will be used in much of the analysis which 
follows since we are assuming linearity and because we are interested in 
the changes which occur in our set of r probabilities, $p_j$ (j = 1, 2, ..., r).
Because the matrices we shall use are event operators there will be a
different matrix for each event. When an event occurs, its matrix will
be applied to the whole set of probabilities for the r alternatives, and thus
change all probabilities simultaneously. For the benefit of those readers
unfamiliar with the use of matrices we shall describe some fundamental
principles of matrix algebra.*

Simple arithmetic and algebra deal with single numbers and their
combinations. We know the rules of addition and multiplication of
these numbers so well that we seldom realize that those rules are arbitrary.
Now a matrix is simply a rectangular array of numbers, called elements of
the matrix, as shown below:


* For a treatment of matrix algebra, see Thurstone [6].


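$\begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1r} \\ u_{21} & u_{22} & \cdots & u_{2r} \\ \vdots & \vdots & & \vdots \\ u_{s1} & u_{s2} & \cdots & u_{sr} \end{bmatrix}$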


Each element has two subscripts: the first indicates the row of the element, 
and the second tells us the column of the element. Nothing is implied 
about combining these elements. Just because the elements are thrown 
together in a matrix does not mean that they necessarily have anything 
to do with one another. They may or they may not. Whether they do 
or not comes from considerations apart from the matrix. If the matrix has 
s rows and r columns, it has s × r elements. The entire matrix plays the
role in matrix algebra that is played by a single quantity x in ordinary
algebra. Matrices acquire meaning as soon as we specify how two
matrices are combined to give a third matrix. Thus, we need the rules of
addition and multiplication for two matrices. We shall illustrate these
rules for three rows (s = 3) and three columns (r = 3).


Consider two matrices U and V given by (a letter in bold-face type will 
denote a matrix) 


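$\mathbf{U} = \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ u_{21} & u_{22} & u_{23} \\ u_{31} & u_{32} & u_{33} \end{bmatrix}, \qquad \mathbf{V} = \begin{bmatrix} v_{11} & v_{12} & v_{13} \\ v_{21} & v_{22} & v_{23} \\ v_{31} & v_{32} & v_{33} \end{bmatrix}.$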


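By definition the sum of U and V is


$\mathbf{U} + \mathbf{V} = \begin{bmatrix} u_{11} + v_{11} & u_{12} + v_{12} & u_{13} + v_{13} \\ u_{21} + v_{21} & u_{22} + v_{22} & u_{23} + v_{23} \\ u_{31} + v_{31} & u_{32} + v_{32} & u_{33} + v_{33} \end{bmatrix}.$

For example, the elements may have the values given by

$\mathbf{U} = \begin{bmatrix} 2 & 1 & 3 \\ 0 & 1 & 2 \\ 4 & 0 & 6 \end{bmatrix}, \qquad \mathbf{V} = \begin{bmatrix} 0 & 3 & 1 \\ 7 & 0 & 0 \\ 1 & 5 & 1 \end{bmatrix},$

in which case their sum is

$\mathbf{U} + \mathbf{V} = \begin{bmatrix} 2 & 4 & 4 \\ 7 & 1 & 2 \\ 5 & 5 & 7 \end{bmatrix}.$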


The rule of multiplication is a bit more complicated. Multiplication 
of two matrices A and B yields different results, depending on the order; 
usually AB does not equal BA. If AB = BA, then A and B are said to 
commute. Moreover, the product AB is defined only when the number 
of columns in A is equal to the number of rows in B. Referring to the 
3 x 3 matrices U and V given above, we obtain the element in the ith row 
and the jth column of the product matrix UV by multiplying the ith row 
of U times the jth column of V in the following special way: 


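$u_{i1} v_{1j} + u_{i2} v_{2j} + u_{i3} v_{3j} = \sum_{k=1}^{3} u_{ik} v_{kj}.$

Therefore we have

$\mathbf{UV} = \begin{bmatrix} (u_{11}v_{11} + u_{12}v_{21} + u_{13}v_{31}) & (u_{11}v_{12} + u_{12}v_{22} + u_{13}v_{32}) & (u_{11}v_{13} + u_{12}v_{23} + u_{13}v_{33}) \\ (u_{21}v_{11} + u_{22}v_{21} + u_{23}v_{31}) & (u_{21}v_{12} + u_{22}v_{22} + u_{23}v_{32}) & (u_{21}v_{13} + u_{22}v_{23} + u_{23}v_{33}) \\ (u_{31}v_{11} + u_{32}v_{21} + u_{33}v_{31}) & (u_{31}v_{12} + u_{32}v_{22} + u_{33}v_{32}) & (u_{31}v_{13} + u_{32}v_{23} + u_{33}v_{33}) \end{bmatrix}.$

For the numerical values given above for the elements of U and V we have

$\mathbf{UV} = \begin{bmatrix} (2 \times 0 + 1 \times 7 + 3 \times 1) & (2 \times 3 + 1 \times 0 + 3 \times 5) & (2 \times 1 + 1 \times 0 + 3 \times 1) \\ (0 \times 0 + 1 \times 7 + 2 \times 1) & (0 \times 3 + 1 \times 0 + 2 \times 5) & (0 \times 1 + 1 \times 0 + 2 \times 1) \\ (4 \times 0 + 0 \times 7 + 6 \times 1) & (4 \times 3 + 0 \times 0 + 6 \times 5) & (4 \times 1 + 0 \times 0 + 6 \times 1) \end{bmatrix} = \begin{bmatrix} 10 & 21 & 5 \\ 9 & 10 & 2 \\ 6 & 42 & 10 \end{bmatrix}.$

The product VU follows the same rule and is for our numerical example

$\mathbf{VU} = \begin{bmatrix} 4 & 3 & 12 \\ 14 & 7 & 21 \\ 6 & 6 & 19 \end{bmatrix}.$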
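The row-by-column rule and the two numerical products can be verified mechanically. The sketch below is written in plain Python with matrices stored as lists of rows; the function name matmul is our own choice, and the matrices are the numerical U and V of the example.

```python
def matmul(U, V):
    """Multiply two matrices (lists of rows) by the row-by-column rule."""
    if len(U[0]) != len(V):
        raise ValueError("columns of U must equal rows of V")
    return [[sum(U[i][k] * V[k][j] for k in range(len(V)))
             for j in range(len(V[0]))]
            for i in range(len(U))]


U = [[2, 1, 3],
     [0, 1, 2],
     [4, 0, 6]]
V = [[0, 3, 1],
     [7, 0, 0],
     [1, 5, 1]]

UV = matmul(U, V)
VU = matmul(V, U)

print(UV)
print(VU)
```

The two results differ, so this particular pair of matrices does not commute.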


We may multiply a matrix U by a scalar quantity c; such an operation
tells us to multiply each element of U by c. For example, consider the
matrix U above and a constant c = 3. Then


$c\mathbf{U} = \begin{bmatrix} 6 & 3 & 9 \\ 0 & 3 & 6 \\ 12 & 0 & 18 \end{bmatrix}.$


A special matrix which will be of considerable interest later is the
identity matrix I defined by (for s = r = 3):


$\mathbf{I} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$


The reader can easily verify that when I is multiplied with any other 
matrix W (for which the operation is defined) the product is identical to W. 
Although the rules of matrix addition and multiplication given above
were for three rows and three columns, these rules are readily generalized
for s rows and r columns.


RULE I: THE ADDITION OF MATRICES. If U and V are two s × r
matrices and their sum is the matrix S, we have for the elements of S

(1.7) $s_{ij} = u_{ij} + v_{ij}$ (i = 1, 2, ..., s; j = 1, 2, ..., r).

RULE II: THE MULTIPLICATION OF MATRICES. If U is an s × r matrix
and V is an r × t matrix, the product M = UV is an s × t matrix and the
elements are

(1.8) $m_{ij} = \sum_{k=1}^{r} u_{ik} v_{kj}$ (i = 1, 2, ..., s; j = 1, 2, ..., t).

RULE III: SCALAR MULTIPLICATION. If an s × r matrix U is multiplied
by a scalar c, every element in that matrix is to be multiplied by c. Let
W = cU; then the elements of W are

(1.9) $w_{ij} = c u_{ij}$ (i = 1, 2, ..., s; j = 1, 2, ..., r).

It will be recalled that the product UV was not necessarily identical 
with the product VU. Thus multiplication is not commutative for 
matrices. On the other hand, U + V = V + U, because we are just 


adding up single elements. Thus addition is commutative for matrices. 
Another important arithmetic rule has to do with associativity. If we
add 3 and 2 and then add 6, we get the same result as adding the sum of
2 and 6 to 3. Such an associative law holds for matrices: (U + V) + W
= U + (V + W). Furthermore, the associative rule holds for
multiplication as well: (UV)W = U(VW). The interpretation is that if we
first multiply V by U and then multiply W by the product matrix, we get
the same result as multiplying W first by V and then multiplying this
product matrix by U. Some readers may wonder whether there is any
arithmetical operation for which associativity does not hold. The
answer is yes; subtraction is a familiar example:


$3 - [5 - 4] \ne [3 - 5] - 4.$


Matrices are not necessarily square. In fact, a matrix may have only 


one row, in which case it is called a row vector. Or a matrix may have 
but one column, in which case it is called a column vector. (If the reader 
is familiar with the mathematical term “vector,” no further explanation 
is required; if he is not, it will suffice if he merely regards the word as a 
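synonym for "matrix with one row or one column." Vectors, like other
matrices, are indicated by letters in bold-face type.) Consider the matrix
U given above and a column vector

$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.$

The product Ux is obtained by Rule II stated above and is

$\mathbf{Ux} = \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ u_{21} & u_{22} & u_{23} \\ u_{31} & u_{32} & u_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} u_{11}x_1 + u_{12}x_2 + u_{13}x_3 \\ u_{21}x_1 + u_{22}x_2 + u_{23}x_3 \\ u_{31}x_1 + u_{32}x_2 + u_{33}x_3 \end{bmatrix}.$

The product Ux, then, is also a column vector. For the numerical values
of the elements of U given above and the vector

$\mathbf{x} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix},$

we have

$\mathbf{Ux} = \begin{bmatrix} 14 \\ 7 \\ 26 \end{bmatrix}.$
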


Note that the product xU is not defined since the number of columns in 
x does not equal the number of rows in U. 

We represent our set of r probabilities, for the r response classes, by a
column vector p:

$\mathbf{p} = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_r \end{bmatrix}.$

In order to transform this column vector we need a class of operators.
Such an operator is an r × r matrix T:

$\mathbf{T} = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1r} \\ u_{21} & u_{22} & \cdots & u_{2r} \\ \vdots & \vdots & & \vdots \\ u_{r1} & u_{r2} & \cdots & u_{rr} \end{bmatrix}.$

When we apply this matrix operator T to the column vector p we obtain a
new column vector Tp:

$\mathbf{Tp} = \begin{bmatrix} u_{11}p_1 + u_{12}p_2 + \cdots + u_{1r}p_r \\ u_{21}p_1 + u_{22}p_2 + \cdots + u_{2r}p_r \\ \vdots \\ u_{r1}p_1 + u_{r2}p_2 + \cdots + u_{rr}p_r \end{bmatrix}.$

(As a mnemonic device, Tp may be read as "transformed p.") Once we
have used T to operate on p it is easy to give an interpretation of the
elements $u_{jk}$. They tell us how to weight up the probabilities of the various
classes to get the new probabilities of those classes, or, more simply, they
provide the instructions for transferring probability from one class to
another. In the next section we use operators like the matrix T to discuss
the special case of two alternatives $A_1$ and $A_2$.




1.6 TWO ALTERNATIVES OR RESPONSE CLASSES 
Before continuing our general development, we illustrate the use of 
matrix operators for the simplest case of two mutually exclusive and 
exhaustive classes of responses. Actually, this amounts to much more 
than an illustration, as most of the experimental problems discussed in 
Part II involve only two types of responses. We define a probability 
vector 


(1.10) $\mathbf{p} = \begin{bmatrix} p \\ q \end{bmatrix},$


where p and q are the probabilities of occurrence of $A_1$ and $A_2$, respectively.
Note the distinction between the vector $\mathbf{p}$ and the element $p$. Here, as
usual, we have the probability invariance rule:


$p + q = 1.$


The most general square matrix operator for this case is of the form

(1.11) $\mathbf{T} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix}.$

When the operator T is applied to the vector p we obtain a new vector
given by

(1.12) $\mathbf{Tp} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} u_{11}p + u_{12}q \\ u_{21}p + u_{22}q \end{bmatrix}.$

Before the application of T, the probability vector is p. After the
application of T to p the probability vector is Tp given above. Since
alternatives $A_1$ and $A_2$ have been defined to be mutually exclusive and
exhaustive, and the probability of occurrence of $A_1$ is $u_{11}p + u_{12}q$ whereas
the probability of occurrence of $A_2$ is $u_{21}p + u_{22}q$, we must have, then,
according to the invariance rule,

$(u_{11}p + u_{12}q) + (u_{21}p + u_{22}q) = 1,$

or

(1.13) $(u_{11} + u_{21})p + (u_{12} + u_{22})q = 1.$

This equation must hold for all values of p and q consistent with the
condition that p and q sum to unity, and so in particular for p = 1 and
q = 0 we get the restriction

$u_{11} + u_{21} = 1,$

whereas for q = 1 and p = 0, we have

$u_{12} + u_{22} = 1.$


These last two equations assert that the column sums of the matrix T
must each be unity. (Matrices with columns which separately sum to
unity are called "stochastic matrices.") We now make a change in
notation, for reasons that will become clear presently. We let

$a = u_{12}$

$b = u_{21}.$

These substitutions plus the fact that the columns must sum to unity
allow us to write the 2 × 2 matrix operator T given above (equation 1.11)
in the form

$\mathbf{T} = \begin{bmatrix} 1 - b & a \\ b & 1 - a \end{bmatrix}.$

Now in view of our discussion in Section 1.3 on factors which change
the probabilities, it follows that we need as many operators as we have
classes of events. In general, we may have t events and so we will have
one operator for each. All these operators will have the form of T in
the last equation, but the elements of the matrices will differ. For
occurrences of the ith event (i = 1, 2, ..., t) we apply the operator

(1.14) $\mathbf{T}_i = \begin{bmatrix} 1 - b_i & a_i \\ b_i & 1 - a_i \end{bmatrix}.$

If we now apply one of these operators to the operand p defined in equation
1.10, we obtain

(1.15) $\mathbf{T}_i \mathbf{p} = \begin{bmatrix} (1 - b_i)p + a_i q \\ b_i p + (1 - a_i)q \end{bmatrix}$ (i = 1, 2, ..., t).

In the analysis in the following chapters we can save considerable space
if we denote the first element of the foregoing vector as $Q_i p$ and the second
element by $\tilde{Q}_i q$. Since the elements of the probability vector $\mathbf{T}_i \mathbf{p}$ sum to
unity, $\tilde{Q}_i q = 1 - Q_i p$. Hence we write*

(1.16) $Q_i p = (1 - b_i)p + a_i q$

$\tilde{Q}_i q = b_i p + (1 - a_i)q$ (i = 1, 2, ..., t).
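The effect of one such operator on a probability vector can be seen in a small numerical sketch. The Python below assumes illustrative values for the two off-diagonal parameters and for the starting probabilities; the names make_T and apply_T are our own and are not taken from the text.

```python
def make_T(a, b):
    """Build the 2 x 2 operator whose columns each sum to unity."""
    return [[1.0 - b, a],
            [b, 1.0 - a]]


def apply_T(T, vec):
    """Multiply the 2 x 2 matrix T into the column vector vec = [p, q]."""
    p, q = vec
    return [T[0][0] * p + T[0][1] * q,
            T[1][0] * p + T[1][1] * q]


# Illustrative (assumed) parameter values and starting probabilities.
a_i, b_i = 0.2, 0.1
p_vec = [0.3, 0.7]

new_vec = apply_T(make_T(a_i, b_i), p_vec)
print(new_vec, sum(new_vec))
```

The first element of the result is the new probability of the first alternative, the second element is its complement, and the two still sum to unity, apart from floating-point rounding.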


* Strictly speaking, the operators $Q_i$ and $\tilde{Q}_i$ are not linear operators even though they
are derived from a linear operator $\mathbf{T}_i$. Mathematicians and physicists define a linear
operator R as one which satisfies the two relations

$R(u + v) = Ru + Rv$
$Rcu = cRu,$

where u and v are arbitrary operands and c is a constant. (See, for example, Davis
[5, p. 7].) Matrices are linear operators, but our operators $Q_i$ and $\tilde{Q}_i$ do not satisfy
these relations and so are not linear operators. Of course, $Q_i p$ and $\tilde{Q}_i q$ are linear
functions of p and q, respectively.


The terms on the right side of these equations may be rearranged in
various ways which are useful for different purposes. First, if we let


(1.17) $\alpha_i = 1 - a_i - b_i$


and use the relation q = 1 - p, we can write the operation in the


SLOPE-INTERCEPT FORM:
(1.18) $Q_i p = a_i + \alpha_i p$ (i = 1, 2, ..., t).
It may be noted that this expression is a linear function of p, analogous to
the expression given earlier for the operator L, where Lp = a + mp.
Second, we may rewrite the first equation of 1.16 in the


GAIN-LOSS FORM:


(1.19) $Q_i p = p + a_i(1 - p) - b_i p$ (i = 1, 2, ..., t).

This form will be particularly convenient for the development in
Chapter 2, but it may be helpful at this point for us to interpret the three
terms on the right side of this equation. The first term is simply the
probability of alternative $A_1$ before an operator is applied. Because p
can only get as large as unity, the second term, $a_i(1 - p)$, represents a gain
in probability which is proportional to the largest possible gain, (1 - p).
The third term, $-b_i p$, denotes a loss proportional to the largest possible
loss, -p, because p can only become as small as zero. Thus $Q_i p$ is made
of three terms, the original value of p, a gain proportional to the greatest
possible gain in p, and a loss proportional to the greatest possible loss.
When described in this way, the operators we are using may appear more
familiar to many readers. Increments proportional to (1 - p) lead to
growth curves which for small values of $a_i$ approximate an increasing
exponential function, whereas decrements proportional to p lead to decay
curves which for small values of $b_i$ approximate a decreasing exponential
function. Hull [7] assumed increments in "habit strength" to follow such
a growth law.

Still another form of writing the function $Q_i p$ is useful. We replace $a_i$
in the slope-intercept form by $1 - \alpha_i$ times a new constant $\lambda_i$:


$a_i = (1 - \alpha_i)\lambda_i.$


This equation is simply the definition of $\lambda_i$. Substituting this in $Q_i p$ gives
the


FIXED-POINT FORM:
(1.20) $Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i.$


We now indicate our reason for introducing this form of writing. If it
should happen that $p = \lambda_i$, we see that this equation (1.20) tells us that
$Q_i p = \lambda_i$. Conversely, if we require that $Q_i p = p$, we have $p = \lambda_i$
provided only that $\alpha_i \ne 1$. Therefore $\lambda_i$ is a fixed point of the operator
$Q_i$; if p should ever equal $\lambda_i$, the operator $Q_i$ would not change p. We
shall be interested in the fixed points $\lambda_i$ throughout much of this book.

Fig. 1.3. Geometrical interpretation of the parameters $a_i$, $b_i$, $\alpha_i$, and $\lambda_i$.

In this fixed-point form, the coefficients $\alpha_i$ and $1 - \alpha_i$ can be regarded
as weights summing to unity. When so regarded we see that $Q_i p$ is just
a weighted average of p and $\lambda_i$. When $\alpha_i$ is between 0 and 1, this
average will, of course, lie between p and $\lambda_i$.

Fig. 1.4. Linear plot of the probability p, showing the fixed point $\lambda_i$.
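That the slope-intercept, gain-loss, and fixed-point forms describe one and the same function can be checked numerically. In the Python sketch below the function names and the parameter values are illustrative assumptions of ours; the fixed point is computed from the defining relation $a_i = (1 - \alpha_i)\lambda_i$, which requires $\alpha_i \ne 1$.

```python
def q_gain_loss(p, a, b):
    """Gain-loss form: p + a(1 - p) - bp."""
    return p + a * (1.0 - p) - b * p


def q_slope_intercept(p, a, b):
    """Slope-intercept form: a + alpha * p, with alpha = 1 - a - b."""
    alpha = 1.0 - a - b
    return a + alpha * p


def fixed_point(a, b):
    """Solve a = (1 - alpha) * lam for lam; requires alpha != 1."""
    alpha = 1.0 - a - b
    return a / (1.0 - alpha)


def q_fixed_point(p, a, b):
    """Fixed-point form: alpha * p + (1 - alpha) * lam."""
    alpha = 1.0 - a - b
    return alpha * p + (1.0 - alpha) * fixed_point(a, b)


# Illustrative (assumed) values.
a, b, p = 0.2, 0.1, 0.3
lam = fixed_point(a, b)
print(q_gain_loss(p, a, b), q_slope_intercept(p, a, b), q_fixed_point(p, a, b))
print(lam, q_gain_loss(lam, a, b))
```

The three forms agree at the chosen p, and applying the operator at the fixed point leaves the probability unchanged, up to floating-point rounding.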


A geometrical interpretation of the various parameters we have
introduced may be seen in Fig. 1.3. We have plotted the function $Q_i p$ against
p. The slope of this line is $\alpha_i$ and the intercept at p = 0 is $a_i$. The value
of $Q_i p$ at p = 1 is $1 - b_i$. We also show the line y = p, the main diagonal
of the unit square. The point at which the line $Q_i p$ intersects this line
y = p is the fixed point $\lambda_i$. Another way of visualizing the fixed point $\lambda_i$
is shown in Fig. 1.4. Here we plot values of p along a line from 0 to 1.
Some point on the line has the value $\lambda_i$, and another point the value p.
If p is to the left of $\lambda_i$ and we apply $Q_i$ to p, the new value $Q_i p$ is to the
right of p. When p is to the right of $\lambda_i$, $Q_i p$ is to the left of p. Hence the
effect of the operator $Q_i$ is to generate a new point in the direction of $\lambda_i$
from p. If p is at $\lambda_i$, then $Q_i p$ is also at $\lambda_i$. This is the reason for calling
$\lambda_i$ the fixed point of the operator $Q_i$.
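The complement of $Q_i p$ is $\tilde{Q}_i q = 1 - Q_i p$, and from equation 1.20
we have (remembering that p = 1 - q)

$\tilde{Q}_i q = \alpha_i q + (1 - \alpha_i)(1 - \lambda_i).$

The vector $\mathbf{T}_i \mathbf{p}$ is thus

$\mathbf{T}_i \mathbf{p} = \begin{bmatrix} \alpha_i p + (1 - \alpha_i)\lambda_i \\ \alpha_i q + (1 - \alpha_i)(1 - \lambda_i) \end{bmatrix}.$

This vector may be written as the sum of two vectors. We remove the
first term from the expression for each element (Rule I) and get

$\mathbf{T}_i \mathbf{p} = \begin{bmatrix} \alpha_i p \\ \alpha_i q \end{bmatrix} + \begin{bmatrix} (1 - \alpha_i)\lambda_i \\ (1 - \alpha_i)(1 - \lambda_i) \end{bmatrix}.$

Next we remove the scalar multipliers according to Rule III and get

$\mathbf{T}_i \mathbf{p} = \alpha_i \begin{bmatrix} p \\ q \end{bmatrix} + (1 - \alpha_i) \begin{bmatrix} \lambda_i \\ 1 - \lambda_i \end{bmatrix},$

and so if we denote by $\boldsymbol{\lambda}_i$ the vector with components $\lambda_i$ and $1 - \lambda_i$ we
have the vector equation

(1.21) $\mathbf{T}_i \mathbf{p} = \alpha_i \mathbf{p} + (1 - \alpha_i)\boldsymbol{\lambda}_i.$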
EXAMPLE 1. Show that

$\lambda_i = \frac{a_i}{a_i + b_i},$

when $a_i + b_i \ne 0$. It has been shown that $\lambda_i$ is the value of p such that
$Q_i p = p$. When we write the gain-loss form

$Q_i p = p + a_i(1 - p) - b_i p$

we can ask that the right side be equal to p:

$p + a_i(1 - p) - b_i p = p$

$a_i(1 - p) = b_i p$

$p = \frac{a_i}{a_i + b_i} = \lambda_i,$

which is the desired result.


EXAMPLE 2. When $\alpha_i = 1$, all values of p are fixed points of the
operator $Q_i$. When $\alpha_i = 1$ is substituted into the fixed-point form

$Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i$

we get

$Q_i p = p.$

EXAMPLE 3. Suppose $a_i = -1.4$, $b_i = -0.5$; find $Q_i p$ when p = 0.6.

$Q_i p = p + a_i(1 - p) - b_i p.$

Substituting into the right side we get 0.6 - 1.4(0.4) + 0.5(0.6) =
0.6 - 0.56 + 0.30 = 0.34.

EXAMPLE 4. Do Example 3 with p = 0.1. We get 0.1 - 1.4(0.9)
+ 0.5(0.1) = -1.11. Since probabilities cannot be negative, such a
result is impossible. This example and the preceding one show that there
are values of $a_i$ and $b_i$ that will give admissible results for some values of p,
but not for others. In the next section we regard this situation as
undesirable, and outlaw any values of $a_i$ and $b_i$, and therefore any values
of $\alpha_i$ and $\lambda_i$, that do not give admissible results for every value of p when
$Q_i$ is applied.


EXAMPLE 5. Let $\lambda_i = 0.6$, $\alpha_i = 0.4$; compute $Q_i p$ for p = 0.5 and for
p = 0.7. We use the fixed-point form


$Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i.$


When p = 0.5 we get 0.4(0.5) + 0.6(0.6) = 0.56. When p = 0.7 we get
0.4(0.7) + 0.6(0.6) = 0.64. As in the discussion of Fig. 1.4, the important
point in this example is that $Q_i p$ is larger than p in the first case, but smaller
than p in the second case, even though the same operator was applied in
both cases. After we have restricted our parameters, it will turn out that
application of the operator brings $Q_i p$ between p and $\lambda_i$ when $\alpha_i$ is positive.

EXAMPLE 6. When $\lambda_i = 0$ the operator $Q_i p$ reduces to the same form
as the operator M, where $m = \alpha_i$. We substitute $\lambda_i = 0$ in the fixed-point
form

$Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i$

to get

$Q_i p = \alpha_i p.$


EXAMPLE 7. Interpret $\lambda_i = 1$, $\alpha_i = 0$. Then $Q_i p = 1$. This means
that no matter what p was, the probability is changed to 1. This in turn
means that the operator $Q_i$ might correspond to insightful learning, or
"one-trial learning," if $Q_i$ is to be applied on the first trial. Similarly
$\lambda_i = 0$ and $\alpha_i = 0$ correspond to "extinction" in one application of the
operator.


EXAMPLE 8. Let $a_i = 0.4$, $\alpha_i = -0.3$, p = 0.5, and find the new
probabilities after applying $Q_i$ once and twice. Using the slope-intercept
form

$Q_i p = a_i + \alpha_i p$

we have after one application, 0.4 - 0.3(0.5) = 0.25. The second
application gives 0.4 - (0.3)(0.25) = 0.325. The sequence of
probabilities after 0, 1, 2 applications of $Q_i$ is 0.5, 0.25, 0.325. The important
point to notice is that the p's first decrease, then increase. This oscillatory
feature is peculiar to operators with negative values of $\alpha_i$. In practical
applications we have not used negative $\alpha$'s.
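The oscillation in Example 8 can be followed for a few more trials with a short Python sketch. The function name iterate and the choice of four applications are our own assumptions; the parameter values are those of the example, and the fixed point is computed as $a_i/(1 - \alpha_i)$.

```python
def iterate(a, alpha, p, n):
    """Apply the slope-intercept operator a + alpha * p to p a total of n times.

    Returns the list of probabilities after 0, 1, ..., n applications.
    """
    seq = [p]
    for _ in range(n):
        p = a + alpha * p
        seq.append(p)
    return seq


a_i, alpha_i = 0.4, -0.3
seq = iterate(a_i, alpha_i, 0.5, 4)
lam = a_i / (1.0 - alpha_i)   # fixed point of the operator

# Sign of (p - lam) at each step; alternation shows the oscillation.
sides = [value > lam for value in seq]
print(seq, lam, sides)
```

Successive values fall alternately above and below the fixed point while drawing closer to it, which is the oscillatory character noted in the example.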


1.7 RESTRICTIONS ON THE PARAMETERS 


In this section we discuss restrictions on the several sets of parameters
we have introduced, restrictions which arise from the fact that probabilities
must remain between zero and unity. Readers not interested in the
derivations of these restrictions may omit this section; the results are
summarized in Figs. 1.5 and 1.6. Similar discussions may be found in
the literature [1, 2, 8, 9].

All values of probability must lie in the closed interval from zero to
unity. The term closed interval means that zero and unity are included as
possible values, and such an interval is designated by square brackets:
[0, 1]. This means that p and $Q_i p$ must not be smaller than zero or
larger than unity. This requirement places some restrictions on the
allowed values of the parameters $a_i$ and $b_i$ in the gain-loss form
$Q_i p = p + a_i(1 - p) - b_i p$. Consider first the situation when p = 0.
We see that we then have $Q_i p = a_i$. Since $Q_i p$ must be in the closed
interval [0, 1], we must have the restriction


(1.22) $0 \le a_i \le 1.$


Then consider what happens when p = 1: we see that this implies that
$Q_i p = 1 - b_i$. Again $Q_i p$ must be in the interval [0, 1] and so $(1 - b_i)$
must be in that interval. From this we conclude

(1.23) $0 \le b_i \le 1.$


We have obtained the foregoing restrictions on $a_i$ and $b_i$ by considering
special limiting values of p. Therefore, these restrictions are necessary
conditions, but we have yet to demonstrate that they are sufficient to keep
$Q_i p$ in the closed interval [0, 1] for all values of p. This sufficiency is
easily demonstrated, however. First, we note that for any allowed p,
the expression $p + a_i(1 - p) - b_i p$ assumes its largest value when $a_i$ is
as large as possible and when $b_i$ is as small as possible. Thus, for given p
in the closed interval [0, 1], the probability $Q_i p$ will be largest when
$a_i = 1$ and $b_i = 0$. Substitution of these values gives $Q_i p = 1$. We
conclude then that the restrictions on $a_i$ and $b_i$ are sufficient to keep $Q_i p$
from getting larger than unity for all allowed values of p. Furthermore,
we note that for a given allowed value of p, the probability will be as small
as possible when $a_i = 0$ and $b_i = 1$. Substituting these values we obtain
$Q_i p = 0$. Therefore, the restrictions on $a_i$ and $b_i$ are sufficient to keep
$Q_i p$ from being negative for any allowed value of p.


Fig. 1.5a. A plot showing restrictions on the parameters $a_i$ and $b_i$.
The square shaded area represents the admissible region.

Fig. 1.5b. A plot showing restrictions on the parameters $a_i$ and $b_i$
when the function $Q_i p$ is restricted to have positive slope, $\alpha_i$. The
triangular shaded area represents the admissible region.


To summarize, necessary and sufficient conditions for $0 \le Q_i p \le 1$,
for all p, $0 \le p \le 1$, are $0 \le a_i \le 1$, $0 \le b_i \le 1$. These restrictions on
$a_i$ and $b_i$ may be depicted geometrically if we plot values of $a_i$ on an
ordinate and values of $b_i$ on an abscissa as shown in Fig. 1.5a.

The parameter $\alpha_i$, defined earlier by $1 - a_i - b_i$, also has a
restricted range of values. We see at once from that definition and from
the above restrictions on $a_i$ and $b_i$ that we must have

$-1 \le \alpha_i \le 1.$

However, this condition, along with the restriction on $a_i$, is not sufficient
to keep $Q_i p$ $(= a_i + \alpha_i p)$ in the closed interval [0, 1]. For example, if
$a_i = 0$ and $\alpha_i = -1$, then $Q_i p = -p$, and this is impossible. This
suggests that $a_i$ and $\alpha_i$ cannot be chosen independently, and this is indeed
the case. We see that $Q_i p$ increases linearly with p provided that $\alpha_i$ is
positive. In that case $Q_i p$ has a maximum value of $a_i + \alpha_i$ when p = 1.
Hence for $\alpha_i$ positive

$a_i + \alpha_i \le 1$

or

$\alpha_i \le 1 - a_i.$

So we see that if $\alpha_i$ is positive it must be less than or equal to $1 - a_i$.
Furthermore, when $\alpha_i$ is positive, $Q_i p$ has a minimum value of $a_i$ when
p = 0. Therefore we must have

$0 \le a_i.$

Now when $\alpha_i$ is negative, $Q_i p$ has a maximum value of $a_i$ when p = 0
and so

$a_i \le 1.$


Fig. 1.6a. Plot showing the possible values of $\alpha_i$ and $a_i$. The shaded
area indicates the values consistent with the restrictions
$0 \le a_i \le 1$, $-a_i \le \alpha_i \le 1 - a_i$.

Fig. 1.6b. Plot showing the possible values of $\alpha_i$ and $a_i$ when $\alpha_i$ is
restricted to positive values. The shaded area represents the values
consistent with the restrictions $0 \le a_i \le 1$, $0 \le \alpha_i \le 1 - a_i$.


Finally, when $\alpha_i$ is negative, $Q_i p$ has a minimum value of $a_i + \alpha_i$ when
p = 1 and so if $\alpha_i$ is negative we must have

$0 \le a_i + \alpha_i$

or

$-a_i \le \alpha_i.$

Hence, if $\alpha_i$ is negative it must be greater than or equal to $-a_i$. We may
now combine the foregoing inequalities to obtain

$0 \le a_i \le 1$

and

(1.24) $-a_i \le \alpha_i \le 1 - a_i.$


The range of possible values of $a_i$ and $\alpha_i$ is shown in Fig. 1.6a.
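The restrictions on $a_i$ and $\alpha_i$ can be compared with a direct check that $a_i + \alpha_i p$ stays in [0, 1]. The Python sketch below is only an illustration: the function names, the grid of p values, and the small tolerance eps (introduced to absorb floating-point rounding) are assumptions of ours.

```python
EPS = 1e-12  # tolerance for floating-point rounding (our assumption)


def stays_in_unit_interval(a, alpha, steps=100):
    """Check 0 <= a + alpha * p <= 1 on an evenly spaced grid of p in [0, 1]."""
    return all(-EPS <= a + alpha * (k / steps) <= 1.0 + EPS
               for k in range(steps + 1))


def satisfies_restrictions(a, alpha):
    """Check 0 <= a <= 1 together with -a <= alpha <= 1 - a."""
    return 0.0 <= a <= 1.0 and -a - EPS <= alpha <= 1.0 - a + EPS


# The inadmissible pair discussed in the text: a = 0, alpha = -1 gives -p.
print(stays_in_unit_interval(0.0, -1.0), satisfies_restrictions(0.0, -1.0))

# An admissible pair with negative slope.
print(stays_in_unit_interval(0.4, -0.3), satisfies_restrictions(0.4, -0.3))
```

Because the function is linear in p, its extremes occur at p = 0 and p = 1, so the grid check and the inequality check agree for every pair tried here.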


The two sets of conditions just derived, $0 \le a_i \le 1$, $0 \le b_i \le 1$, and
$0 \le a_i \le 1$, $-a_i \le \alpha_i \le 1 - a_i$, will be assumed throughout the
remainder of this book.

The conditions on the fixed points $\lambda_i$ in the equation


$Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i$

follow rather easily from our earlier discussion. We must have

(1.25) $0 \le \lambda_i \le 1,$

because p must also be between 0 and 1.

Throughout most of this book we shall use one further restriction on
the parameters, a restriction not imposed by the mathematics but rather
by the fact that we are interested in describing learning phenomena.
The condition is simply that $\alpha_i$ be non-negative:

(1.26) $\alpha_i \ge 0.$


This restriction means that the slope of the line $Q_i p$ versus p, shown in
Fig. 1.3, must be positive or zero. Negative values of $\alpha_i$ lead to the
oscillatory character illustrated in Example 8 above. Negative values
of $\alpha_i$ would imply that the operators transform large values of probabilities
into small values and vice versa. These properties seem undesirable to
us for most problems, and moreover we have never analyzed experimental
data which required negative values of the $\alpha$'s.

Now since $\alpha_i = 1 - a_i - b_i$ as defined in equation 1.17, we see that
our condition that $\alpha_i$ be non-negative implies that


$a_i + b_i \le 1.$
We now revise our three sets of restrictions on the parameters as follows:

I. SLOPE-INTERCEPT FORM: $Q_i p = a_i + \alpha_i p$,

(1.27) $0 \le a_i \le 1, \qquad 0 \le \alpha_i \le 1 - a_i.$

II. GAIN-LOSS FORM: $Q_i p = p + a_i(1 - p) - b_i p$,

(1.28) $0 \le a_i \le 1, \qquad 0 \le b_i \le 1, \qquad 0 \le a_i + b_i \le 1.$

III. FIXED-POINT FORM: $Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i$,

(1.29) $0 \le \lambda_i \le 1, \qquad 0 \le \alpha_i \le 1.$


Figures 1.5b and 1.6b show the regions consistent with restrictions II and
I just given.

The next section deals with a generalization of the matters considered
up to now to more than two alternatives. This section is mathematically
more formidable than the preceding material and is seldom essential to
the analysis in the following chapters. It is included here for completeness.


*1.8 GENERALIZATION TO r ALTERNATIVES;
COMBINING CLASSES CONDITION


We now return to the general case of r mutually exclusive and exhaustive
alternatives where r > 2. The r probabilities form a column vector

$$p = \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_r \end{pmatrix},$$

and for each of the t possible events we define a matrix operator

$$T_i = \begin{pmatrix} u_{11,i} & u_{12,i} & \cdots & u_{1r,i} \\ u_{21,i} & u_{22,i} & \cdots & u_{2r,i} \\ \vdots & \vdots & & \vdots \\ u_{r1,i} & u_{r2,i} & \cdots & u_{rr,i} \end{pmatrix} \qquad (i = 1, 2, \cdots, t).$$

In the last section, where we discussed the case of r = 2, we imposed
the necessary condition that the probabilities of the two response classes
sum to unity at all times. We now impose this condition for the general
case of r classes of responses. We have already required that probability
be conserved, that is, $\sum_j p_j = 1$ before an operator is applied. Now when
we apply the operator $T_i$ to the vector p above, we obtain the vector

$$T_i p = \begin{pmatrix} u_{11,i} & u_{12,i} & \cdots & u_{1r,i} \\ u_{21,i} & u_{22,i} & \cdots & u_{2r,i} \\ \vdots & \vdots & & \vdots \\ u_{r1,i} & u_{r2,i} & \cdots & u_{rr,i} \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_r \end{pmatrix} = \begin{pmatrix} \sum_j u_{1j,i}\, p_j \\ \sum_j u_{2j,i}\, p_j \\ \vdots \\ \sum_j u_{rj,i}\, p_j \end{pmatrix} \qquad (i = 1, 2, \cdots, t).$$

The elements of this new vector give us the probabilities for the r response
classes after the application of the operator $T_i$, and so these elements must
sum to unity also. This leads to the condition

$$\sum_{k=1}^{r} \sum_{j=1}^{r} u_{kj,i}\, p_j = 1 \qquad (i = 1, 2, \cdots, t).$$




This condition is a generalization of equation 1.13, which states the same
condition for r = 2. We may interchange the order of the summations,
for this merely demands rearranging terms. Thus

$$\sum_{j=1}^{r} \Bigl( p_j \sum_{k=1}^{r} u_{kj,i} \Bigr) = 1.$$

We then introduce the abbreviation

$$s_{j,i} = \sum_{k=1}^{r} u_{kj,i}.$$

Note that this quantity is simply the sum of the elements in the jth column
of the matrix $T_i$. We may now write

$$\sum_{j=1}^{r} s_{j,i}\, p_j = 1.$$

The original probability invariance rule of equation 1.1 was

$$\sum_{j=1}^{r} p_j = 1.$$

Now in order for both of the conditions above to hold simultaneously for
all allowed values of the $p_j$, we must have for all values of j

$$s_{j,i} = 1 \qquad (j = 1, 2, \cdots, r;\ i = 1, 2, \cdots, t).$$

This is readily seen if we combine those conditions:

$$\sum_{j=1}^{r} s_{j,i}\, p_j = \sum_{j=1}^{r} p_j.$$

We may rewrite this as

$$\sum_{j=1}^{r} (s_{j,i} - 1)\, p_j = 0.$$

All the $p_j$ are non-negative and not all are zero, and the equation must hold
for every allowed set of the $p_j$, so this last equation is
satisfied only by $s_{j,i} = 1$.

The meaning of the condition is simple:
each column of the matrix $T_i$ must sum to unity. Incidentally, for every
arbitrary probability vector to be transformed into a probability vector,
it is necessary that the elements of the transformation matrix be between
zero and unity.
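The column-sum condition is easy to verify numerically. The sketch below uses made-up matrices (not from the text): one whose columns sum to unity, and its transpose, whose rows but not columns sum to unity. Only the first conserves total probability.

```python
def apply_operator(T, p):
    """Matrix-vector product T p, with T stored as a list of rows."""
    return [sum(T[k][j] * p[j] for j in range(len(p))) for k in range(len(T))]


def column_sums(T):
    """The quantities s_j: the sum of the elements in each column of T."""
    return [sum(T[k][j] for k in range(len(T))) for j in range(len(T[0]))]


# Illustrative operator whose columns each sum to unity.
T = [[0.5, 0.2, 0.1],
     [0.3, 0.7, 0.1],
     [0.2, 0.1, 0.8]]

# Its transpose: rows sum to unity, columns do not.
T_bad = [[T[j][k] for j in range(3)] for k in range(3)]

p = [0.2, 0.5, 0.3]
new_p = apply_operator(T, p)
bad_p = apply_operator(T_bad, p)

print(column_sums(T), sum(new_p))
print(column_sums(T_bad), sum(bad_p))
```

The transposed matrix shows that summing to unity along rows is not enough; the condition derived above concerns the columns.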

We now wish to introduce one further restriction on our operators. 
(This restriction is automatically fulfilled when we have only two classes
of responses, as will be seen shortly.) It arises from the arbitrariness
with which we have defined the r classes of responses, and we believe that 
this arbitrariness is desirable. The requirement is simply this: If r classes 
of responses are initially defined and if the experimenter later decides to 
treat any two classes in identical manner, it should be possible to combine 
those two classes, thereby obtaining the same results that would have been 
obtained had only the r — 1 classes been defined initially. For example, 
if we are describing a bar-pressing experiment, we could define three 
classes of responses, pressing the bar from the left, pressing the bar from 
the right, and not pressing the bar. Such a distinction between the two 
types of bar presses might be useful for some purposes. But if the 
experimenter does not make such a distinction, we should be able to 
combine the first two classes into a single class of bar-pressing and thereby 
obtain predictions equivalent to those which would have been obtained 
by defining only two classes in the first place. Formally, we may think 
of this combining of two classes as the collapsing of an r-dimensional 
vector space into an (r — 1)-dimensional vector space. 

It should be further pointed out that we have implicitly assumed that 
our probabilities after an event were independent of the distributions of 
probabilities within classes before an event. For example, when we have 
two classes of responses and two operators $Q_1$ and $Q_2$, we have probabilities
$p$ and $q$ before and probabilities $Q_1 p$ and $\tilde{Q}_1 q$ or $Q_2 p$ and $\tilde{Q}_2 q$
afterwards. In the various equations for $Q_i p$, nothing was said about
how the probability p was distributed over various possible subclasses 
of response class 1. 

Let us first examine the implications of this restriction for r = 3. We
start with three response classes and a vector

$$p = \begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix}.$$

We wish to combine classes 1 and 2 to form a new class, which we label c,
and to represent the vector in the collapsed space by

$$Cp = \begin{pmatrix} p_c \\ 0 \\ p_3 \end{pmatrix},$$

requiring that

$$p_c = p_1 + p_2.$$

We denote the collapsed vector by Cp, and this may suggest to the reader
that Cp can be obtained by operating on p with a matrix C. This is
indeed true, and C is given by*

$$C = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$


Now if the ith event occurs, the column vector becomes, after dropping
the subscripts i for simplicity,

$$Tp = \begin{pmatrix} u_{11}p_1 + u_{12}p_2 + u_{13}p_3 \\ u_{21}p_1 + u_{22}p_2 + u_{23}p_3 \\ u_{31}p_1 + u_{32}p_2 + u_{33}p_3 \end{pmatrix}.$$

After collapsing the vector space by applying C to Tp we have

$$C(Tp) = \begin{pmatrix} (u_{11} + u_{21})p_1 + (u_{12} + u_{22})p_2 + (u_{13} + u_{23})p_3 \\ 0 \\ u_{31}p_1 + u_{32}p_2 + u_{33}p_3 \end{pmatrix}.$$

Since the elements of this vector are to be independent of the distributions
of probability within class c, the components of the vector of the last
equation must not depend on $p_1$ and $p_2$ individually but only upon their
sum, $p_c$. Hence we demand that

$$u_{31} = u_{32} = u_3.$$

This restriction assures that $u_{31}p_1 + u_{32}p_2 = u_3(p_1 + p_2) = u_3 p_c$. If we
combine classes 1 and 3 (instead of 1 and 2), we obtain in a similar manner

$$u_{21} = u_{23} = u_2.$$

Or, if we combine classes 2 and 3,

$$u_{12} = u_{13} = u_1.$$

These three relations reduce our original matrix operator to the form

$$T = \begin{pmatrix} u_{11} & u_1 & u_1 \\ u_2 & u_{22} & u_2 \\ u_3 & u_3 & u_{33} \end{pmatrix}.$$


* We are indebted to Gerald L. Thompson for suggesting the use of the operator C.
For a more extensive discussion see [10].


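We have yet to impose the condition that the columns should separately
sum to unity, which requires that

$$u_{11} + u_2 + u_3 = 1,$$
$$u_1 + u_{22} + u_3 = 1,$$
$$u_1 + u_2 + u_{33} = 1.$$

These relations then allow us to write

$$T = \begin{pmatrix} 1 - u_2 - u_3 & u_1 & u_1 \\ u_2 & 1 - u_1 - u_3 & u_2 \\ u_3 & u_3 & 1 - u_1 - u_2 \end{pmatrix}. \tag{1.30}$$

We may note that this matrix can be written as

$$T = (1 - u_1 - u_2 - u_3) \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} + \begin{pmatrix} u_1 & u_1 & u_1 \\ u_2 & u_2 & u_2 \\ u_3 & u_3 & u_3 \end{pmatrix}.$$

If we then let $\alpha = 1 - u_1 - u_2 - u_3$ and $u_j = (1 - \alpha)\lambda_j$ for j = 1, 2, 3,
we obtain

$$T = \alpha \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} + \begin{pmatrix} (1-\alpha)\lambda_1 & (1-\alpha)\lambda_1 & (1-\alpha)\lambda_1 \\ (1-\alpha)\lambda_2 & (1-\alpha)\lambda_2 & (1-\alpha)\lambda_2 \\ (1-\alpha)\lambda_3 & (1-\alpha)\lambda_3 & (1-\alpha)\lambda_3 \end{pmatrix}.$$

By Rule III, we then factor out $(1 - \alpha)$ of the last matrix and have left a
matrix $\Lambda$ defined by

$$\Lambda = \begin{pmatrix} \lambda_1 & \lambda_1 & \lambda_1 \\ \lambda_2 & \lambda_2 & \lambda_2 \\ \lambda_3 & \lambda_3 & \lambda_3 \end{pmatrix}. \tag{1.31}$$

We can then write T in the form

$$T = \alpha I + (1 - \alpha)\Lambda. \tag{1.32}$$

The form of T in this equation is a direct consequence of the combining
classes restriction. When the vector p is multiplied by the matrix $\Lambda$ we
obtain a vector $\lambda$ defined by

$$\Lambda p = \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{pmatrix} = \lambda. \tag{1.33}$$

Finally when we apply T to p we obtain

$$Tp = \alpha I p + (1 - \alpha)\Lambda p = \alpha p + (1 - \alpha)\lambda. \tag{1.34}$$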


The foregoing arguments can be directly applied when there are more
than three response classes and will lead to the following general operator
(cf. equation 1.32):

$$T = \alpha I + (1 - \alpha)\Lambda, \tag{1.35}$$

where

$$\alpha = 1 - \sum_{j=1}^{r} u_j, \tag{1.36}$$

I is the r x r identity matrix, and $\Lambda$ is an r x r matrix given by

$$\Lambda = \begin{pmatrix} \lambda_1 & \lambda_1 & \cdots & \lambda_1 \\ \lambda_2 & \lambda_2 & \cdots & \lambda_2 \\ \vdots & \vdots & & \vdots \\ \lambda_r & \lambda_r & \cdots & \lambda_r \end{pmatrix}, \tag{1.37}$$

where the elements are defined by

$$\lambda_j = u_j/(1 - \alpha). \tag{1.38}$$

For the jth row of the vector Tp we obtain the

GENERAL ELEMENT OF THE VECTOR Tp:

$$p_j' = \alpha p_j + (1 - \alpha)\lambda_j. \tag{1.39}$$

Hence, we see that the new value of probability for the jth class is a linear
function of $p_j$ and not of $p_k$ for $k \ne j$. This represents a major simplification
in the form of our operators. Since the coefficient of $p_j$ is a
constant $\alpha$, we can see at once that, if we combine two classes, say the jth
and the kth, so that

$$p_c = p_j + p_k,$$

we obtain, after applying T, a linear function of $p_c$ alone for the new
probability of the combined class. This, of course, is what our rule for
combining classes demanded.
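A small numerical sketch may help here. It builds an operator of the form of equation 1.35 for r = 3 with illustrative values of $\alpha$ and $\lambda$ (assumptions, not values from the text), applies it to two probability vectors that differ only in how probability is split between classes 1 and 2, and then combines those two classes. For brevity the collapsed vector is stored with two components rather than with the zero middle component used for Cp in the text.

```python
def make_operator(alpha, lam):
    """T = alpha * I + (1 - alpha) * Lambda; row k of Lambda is constant lam[k]."""
    r = len(lam)
    return [[(alpha if k == j else 0.0) + (1.0 - alpha) * lam[k]
             for j in range(r)] for k in range(r)]


def apply_operator(T, p):
    """Matrix-vector product T p, with T stored as a list of rows."""
    return [sum(T[k][j] * p[j] for j in range(len(p))) for k in range(len(T))]


def collapse_first_two(p):
    """Combine classes 1 and 2 into one class c: (p1, p2, p3) -> (p1 + p2, p3)."""
    return [p[0] + p[1], p[2]]


alpha, lam = 0.6, [0.2, 0.3, 0.5]      # illustrative values only
T = make_operator(alpha, lam)

p = [0.1, 0.5, 0.4]                    # same p_c = 0.6, split differently
q = [0.4, 0.2, 0.4]
after_p = collapse_first_two(apply_operator(T, p))
after_q = collapse_first_two(apply_operator(T, q))

# Two-class operator applied directly to p_c, with lambda_c = lam_1 + lam_2.
two_class = alpha * (p[0] + p[1]) + (1.0 - alpha) * (lam[0] + lam[1])
print(after_p, after_q, two_class)
```

Both vectors lead to the same probability for the combined class, and that value agrees with applying a two-class operator with the same $\alpha$ to $p_c$ directly.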


In the preceding development we first introduced linear operators,
represented by matrices, and then imposed two conditions: (1) First we
required that the operators be stochastic, which simply means that they
always change probability vectors into probability vectors; this required
that each column of the matrix sum to unity. (2) Then we introduced
the combining classes restriction and showed that this leads to operators
of the form given by equation 1.35. A considerably more general
development can be made, as was first pointed out to us by L. J. Savage.
It can be shown that the combining classes restriction implies that any
stochastic operator must have the form defined by equation 1.35. Linearity
need not be assumed but follows as a consequence of the combining
classes condition. We refer the reader to the literature for a more precise
statement of the theorem and for its proof [10].

In the preceding section we derived restrictions on our several sets of
parameters for two response classes. Similar restrictions are readily
obtained for r alternatives. First we note that $\lambda$ is a probability vector
and so its components $\lambda_j$ must satisfy

$$0 \le \lambda_j \le 1, \qquad \sum_{j=1}^{r} \lambda_j = 1. \tag{1.40}$$

The restrictions on $\alpha$ may be obtained from the requirement

$$0 \le p_j' \le 1 \quad \text{for all } 0 \le p_j \le 1, \tag{1.41}$$

where $p_j'$ is the general element of the vector Tp, shown by equation 1.39
above. When $p_j = 1$ we see that we must have

$$0 \le \alpha + (1 - \alpha)\lambda_j \le 1,$$

and from these inequalities we can obtain the restriction

$$\frac{-\lambda_j}{1 - \lambda_j} \le \alpha \le 1.$$

But this must hold for each value of j and so we must require that

$$\max_j \left[ \frac{-\lambda_j}{1 - \lambda_j} \right] \le \alpha \le 1. \tag{1.42}$$

A similar condition may be obtained by letting $p_j = 0$, but the restriction
so obtained will always be satisfied if inequality 1.42 is met. The smallest
permissible value of $\alpha$ obtains when $\lambda_j = 1/r$ for $j = 1, 2, \cdots, r$ and is
easily shown to be

$$\frac{-1}{r - 1} \le \alpha. \tag{1.43}$$

This condition is necessary but ordinarily not sufficient.

As in Section 1.7 we could impose the additional condition that $\alpha$ is
non-negative to prevent oscillation. However, L. J. Savage pointed out
to us that this condition may be derived from a minor extension of the
combining classes argument. If we believe in the principle of combining
classes, we should be willing to break up classes of responses into arbitrarily
small subclasses. And in the breaking-up of classes we would certainly
want $\alpha$ to be invariant. But in the limit when r, the number of classes,
becomes arbitrarily large, condition 1.43 leads at once to $0 \le \alpha$. If we
then conceive of combining classes we obtain the condition that $\alpha$ be
non-negative for any finite r.
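The bounds on $\alpha$ can be explored with a few lines of code. This is a sketch under the stated inequalities only; the $\lambda$ vectors are illustrative, and the helper names are not from the text.

```python
def alpha_lower_bound(lam):
    """Lower limit of inequality 1.42: max over j of -lam_j / (1 - lam_j).

    Components equal to 1 impose no constraint and are skipped.
    """
    return max(-x / (1.0 - x) for x in lam if x < 1.0)


def new_probabilities(alpha, lam, p):
    """Equation 1.39 applied to every component of p."""
    return [alpha * pj + (1.0 - alpha) * lj for pj, lj in zip(p, lam)]


r = 4
uniform = [1.0 / r] * r
bound_uniform = alpha_lower_bound(uniform)     # equals -1 / (r - 1)

skewed = [0.1, 0.3, 0.6]
bound_skewed = alpha_lower_bound(skewed)       # set by the smallest lam_j

# At the bound, the most extreme starting vector is still mapped into [0, 1].
corner = new_probabilities(bound_skewed, skewed, [1.0, 0.0, 0.0])
print(bound_uniform, bound_skewed, corner)
```

For the uniform vector the bound reproduces condition 1.43. For the skewed vector the bound is about -0.11, well above $-1/(r-1) = -0.5$, which illustrates why condition 1.43 is necessary but ordinarily not sufficient.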


1.9 SUMMARY 


In this chapter we provide the framework for the analysis contained in
the remainder of this book. The basic elements of the mathematical
system are a set of mutually exclusive and exhaustive alternatives
$A_1, A_2, \cdots, A_r$, a vector of probabilities $p_1, p_2, \cdots, p_r$ with one component
for each alternative, a set of mutually exclusive and exhaustive events
$E_1, E_2, \cdots, E_t$, and a set of operators $T_1, T_2, \cdots, T_t$ corresponding to
those events. The probability $p_j$ ($j = 1, 2, \cdots, r$) represents the probability
of occurrence of alternative $A_j$ on a trial, and a trial is defined as an
opportunity for choice among the r alternatives. When event $E_i$ occurs,
operator $T_i$ is applied to the set of probabilities to yield a new set of
probabilities.

The operators $T_i$ are assumed to be linear and so are represented by
matrices. For the probability invariance rule to hold, the elements of
each matrix must be non-negative and each column must sum to unity.
(Such matrices may be called stochastic matrices.) For more than two
alternatives (r > 2) an additional condition on how classes can be combined
is introduced in Section 1.8. Restrictions on the parameters are
discussed in Section 1.7 for r = 2 and in Section 1.8 for r > 2.

In Section 1.6, a pair of row operators $Q_i$ and $\tilde{Q}_i$ are introduced in
order to dispense with the matrix machinery when there are only two
alternatives (r = 2). These row operators are used throughout this book,
and so the three forms of writing the operations are repeated here:

SLOPE-INTERCEPT FORM: $Q_i p = a_i + \alpha_i p$,
GAIN-LOSS FORM: $Q_i p = p + a_i(1 - p) - b_i p$,
FIXED-POINT FORM: $Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i$.

The row operator $\tilde{Q}_i$ is applied only to $q = 1 - p$, and by definition
$\tilde{Q}_i q = 1 - Q_i p$.

The slope-intercept form and the fixed-point form are used mainly in
Chapters 3 and 4 to simplify algebraic manipulations. The gain-loss
form is used mostly in Chapter 2, where we discuss an interpretation of
the operators in terms of stimulus sampling and conditioning. The
fixed-point form is used in all applications in Part II.


REFERENCES 


1. Bush, R. R., and Mosteller, F. A mathematical model for simple learning.
Psychol. Rev., 1951, 58, 313-323.
2. Bush, R. R., and Mosteller, F. A stochastic model with applications to learning.
Annals of math. Stat., 1953, 24, 559-585.
3. Uspensky, J. V. Introduction to mathematical probability. New York: McGraw-Hill,
1937, pp. 1-13.
4. Feller, W. An introduction to probability theory and its applications. New York:
Wiley, 1950, pp. 1-22.
5. Davis, H. T. The theory of linear operators. Bloomington, Ind.: Principia Press,
1936, pp. 1-15.
6. Thurstone, L. L. Multiple-factor analysis. Chicago: University of Chicago Press,
1947, pp. 1-50.
7. Hull, C. L. Principles of behavior. New York: Appleton-Century-Crofts, 1943,
p. 114.
8. Burros, R. H. Some criticisms of "A mathematical model for simple learning."
Psychol. Rev., 1952, 59, 234-236.
9. Burros, R. H. The linear operator of Bush and Mosteller. Psychol. Rev., 1953, 60,
213-214.
10. Bush, R. R., Mosteller, F., and Thompson, G. L. A formal structure for multiple
choice situations. Decision Processes (edited by R. M. Thrall, C. H. Coombs, and
R. L. Davis), New York: Wiley, 1954, 99-126.


CHAPTER 2 


Stimulus Sampling and Conditioning 


2.1 A SET-THEORETIC APPROACH


The preceding chapter develops operators to describe the changes in the 
probabilities of various responses. The form of these operators is 
dictated chiefly by mathematical considerations—linear operators are 
chosen in order to make the theory more manageable. It is seen in later 
chapters that, even with this simplification, the theory becomes rather 
complicated. The only restrictions on these linear operators are those 
to assure that probability is conserved and to allow classes of responses 
to be combined in a sensible way. We omitted any explicit discussion of 
the stimulus aspects of the learning problem. From the point of view of 
psychological theory this omission may represent a serious gap. In this 
chapter we offer one way to narrow that gap by indicating how a stimulus 
model can generate the operators already presented. This model utilizes 
some elementary notions of mathematical set theory and is a modification 
of the theory developed by Estes [1]. We wish to emphasize, however, 
that a stimulus model is not necessary to the operator approach of Chapter 
1. Rather, it is an alternative way to derive linear operators, a way we 
often found helpful in thinking about specific experimental problems. 

The organismic responses discussed in the last chapter occur, of course,
in the presence of or following various kinds of stimulation impinging
on the organism. Stimulus-response psychology postulates that stimuli
and responses become "connected," "associated," or "conditioned"
during a learning process. Therefore it would be instructive to construct 
a formal model of this conditioning process—the "connecting-up" of 
stimuli and responses—and deduce the formulas of Chapter 1 which 
specify trial-by-trial changes in the probabilities of occurrence of the 
various possible responses. 

The total environment of an organism in an experimental situation 
provides stimulation, but, clearly, certain parts of that environment 
provide more stimulation than other parts. Furthermore, some aspects
of the situation may tend to evoke one response and other aspects may
elicit a different response. Therefore, psychologists have found it useful 
to talk about various parts of a stimulus situation: individual stimuli, 
stimulus elements, stimulus complexes, etc. Certain aspects of the 
stimulating environment may be under experimental control whereas 
others may be constant or allowed to vary at random. In any case, it is 
useful to break up the total stimulation into parts. Gestalt psychologists, 
however, have long argued that we must be cautious in describing separate 
parts of the environment; patterns and relations are important in in- 
fluencing behavior. With considerations such as these in mind we shall 
attempt to represent the total environment of an organism by a set of 
elements. These elements may be defined physically or physiologically 
or behaviorally, as parts or wholes, as things or relations, as stimuli or 
Gestalten, depending upon the problem we face. In the following 
discussion we shall simply call these elements "stimuli" without further 
specification. It happens that this lack of specificity gives rise to no 
difficulties because our final results involve neither the properties of 
stimulus elements nor numbers of such elements. 

It has already been suggested that different parts of an organism's 
environment have more importance than other parts in influencing 
behavior. What stimuli have how much control over a specified kind of 
behavior is a complex experimental problem. Indeed it would be fair 
to say that this question has been and will continue to be a central problem 
in psychology. To discover the stimulus units that control behavior 
even in the most specific situations requires technical skill, extensive 
empirical observations, and good intuition. Although no mathematical 
model can be expected to answer these numerous quantitative questions, 
it is possible to describe the skeleton of the problem in mathematical terms. 
We would like to assign a weight or measure to each of the elements in the 
set of all stimuli. For example, if we were to assign a measure of 2 to
stimulus element $s_1$ and measure 4 to element $s_2$, we would then say that
$s_2$ was twice as important to the organism as $s_1$. (The same statement
would apply if we assigned a measure of 10 to $s_1$ and a measure of 20 to $s_2$.)

Mathematicians have developed a good deal of machinery under the
title of "measure theory." To these mathematicians, measure is something
applied to a set: a number associated with it. It is a function defined
on a set, satisfying the property of additivity. This simply means that
the measure of two sets taken together, when they have no elements in
common, is the measure of one set plus the measure
of the other. In the stimulus model to be described, we shall associate
with any set S of stimulus elements a measure function denoted by
$M(S)$.

In the preceding discussion, it was implicitly assumed that the measure
$M(S)$ of a set S of stimuli was a fixed property of that S and independent
of the possible responses to which S might be conditioned. The stimulus- 
response connections are yet to be specified in the model. It is therefore 
assumed that a stimulus element can exist in one and only one of a number 
of states. Corresponding to each class of responses, $A_1, A_2, \cdots$, there is
a unique state, and so we speak of an element as "conditioned to response
class $A_j$"; this is equivalent to saying that the stimulus element is in the
state corresponding to response class $A_j$. Although a stimulus element
can be in only one state at any instant of time, a set of elements may be 
partially conditioned to several response classes. (Some readers may 
feel that there would be stimulus elements which could not be conditioned 
to any of the response classes defined in a particular experimental situation. 
If this were so, it could be argued that such stimuli had zero measure—they 
would have no influence on the behavior being studied.) 

In the next section we discuss some of the basic concepts of mathematical 
set theory, and in later sections we continue the development of the stimulus 
model. Our goal is to derive the linear operators postulated in Chapter 1; 
once we have done this we shall return to our original approach and make 
no further reference to the stimulus model, except for brief discussions in 
Sections 5.2 and 11.7. In the psychological literature, however, the 
reader will find other applications of similar stimulus sampling models. 
The authors used such a model for discussing stimulus generalization
and discrimination [2], and Estes and Burke have presented a more
general stimulus model to handle problems in stimulus variability [3].
In addition, Estes has applied his original model to provide a description 
of spontaneous recovery [4]. 


2.2 SUBSETS AND THEIR COMBINATIONS 


In this section we present some of the elementary notions and operations 
of mathematical set theory (sometimes called the algebra of classes). 
In Fig. 2.1 we show a set S containing a number of elements represented
by dots. Within S we show three subsets, U, V, and W. These are
called subsets of S because all the elements of U, for example, are contained
in S, but not necessarily all elements of S are contained in U. When two
sets or subsets have no elements in common, such as U and W in Fig. 2.1,
those subsets are said to be disjunct.

Sets may be combined in various ways, and the operations involved 
correspond to such operations as addition and multiplication in simple 
algebra. The set sum of two sets is defined as the set of all elements 
contained in one set or the other or both. For example, the set sum of 
the subsets U and V in Fig. 2.1 is the set composed of all elements belonging 
to either U or V or both. The set sum of U and V is usually denoted by
$U \cup V$ and is also called the join or union of U and V. The symbol $\cup$
is often called cup.

The set product of two sets U and V is defined as the set of elements
contained in both sets and is written $U \cap V$. This set product is also
called the meet or intersection of U and V; the symbol $\cap$ is read cap.
If two sets are disjunct, such as U and W in Fig. 2.1, their set product is
an empty set, that is, a set containing no elements, denoted by 0. So we
have $U \cap W = 0$.

Fig. 2.1. A set S of elements, represented by dots, with three subsets, U, V,
and W.

The union of a set U with itself is of course the set U, and the intersection
of a set U with itself is also the set U. In symbols

$$U \cup U = U,$$
$$U \cap U = U.$$

The complement in S of a set U is defined as the set of all elements in
S but not in U. This complement is denoted by $U'$. From this definition
we can immediately infer that

$$U \cap U' = 0$$

and

$$U \cup U' = S.$$

The first equation simply asserts that the meet (set product) of U and its
complement $U'$ is the empty set 0, that is, U and $U'$ are disjunct. The
second equation says that the union (set sum) of U and its complement in
S is equal to the whole set S.

In the various set diagrams presented in this chapter, we find it convenient
to represent subsets by areas, shaded in various ways, rather than
by collections of dots. The position of the various dots in Fig. 2.1 was
unimportant for our purposes and similarly the relative positions of the
areas in other diagrams is meant to imply nothing about the positions of
stimuli in an actual stimulus situation.
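The operations of this section correspond directly to operations on Python's built-in `set` type. The following sketch uses made-up element labels (the integers 1 to 10 stand in for the dots of Fig. 2.1); the particular memberships are assumptions chosen so that U and V overlap while U and W are disjunct.

```python
# Hypothetical elements standing in for the dots of Fig. 2.1.
S = set(range(1, 11))
U = {1, 2, 3, 4}
V = {3, 4, 5, 6}
W = {8, 9}

set_sum = U | V            # union (join, "cup")
set_product = U & V        # intersection (meet, "cap")
U_complement = S - U       # complement of U in S, written U' in the text

print(set_sum)             # elements in U or V or both
print(set_product)         # elements in both U and V
print(U & W == set())      # True: U and W are disjunct
print(U | U == U, U & U == U)
print(U & U_complement == set(), U | U_complement == S)
```

Python writes the empty set as `set()`, where the text uses the symbol 0.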


2.3 PROBABILITY AND STIMULUS SAMPLING


When there are just two response classes in the model, there are usually
some stimulus elements conditioned to each class. Let $A_1$ represent the
class of responses recorded by an experimenter and let the subset of
stimuli conditioned to that class be denoted by C as shown in Fig. 2.2.
All other responses are represented by $A_2$ and all stimuli in S but not in C
are conditioned to $A_2$. The entire set is denoted by S. On a particular
trial or at a particular time, an organism perceives a subset, X, of the
total stimuli available. This subset X is called a sample from S, and
it may partly intersect the subset C conditioned to $A_1$, and may partly
intersect the complement of C in S, that is, the subset of elements
conditioned to $A_2$ (unshaded in Fig. 2.2).

Fig. 2.2. A stimulus set S with the subset C, which is conditioned to
response $A_1$, and a sample X.

It is now postulated that the probability of response $A_1$ is

$$p = \frac{M(X \cap C)}{M(X)}, \tag{2.1}$$

where $M(X \cap C)$ is the measure of the intersection of X and C and $M(X)$
is the measure of the sample X. Equation 2.1 asserts that the probability
of response $A_1$ is equal to the ratio of the measure of the elements conditioned
to $A_1$ in the sample to the measure of the entire sample.
Intuitively, the probability is the relative importance of the elements
conditioned to $A_1$ in the sample, compared with the total importance of
all the elements in the sample. The value of p obtained here can be used
to determine the response by analogy with a randomizing device such as
the spinning disc described in Section 1.2.

We next introduce an assumption of homogeneity; it is related to the
notion of random sampling. The assumption requires that the fraction
of measure of elements conditioned to $A_1$ in any sample from the set S
is equal to the fraction of measure of elements conditioned to $A_1$ in S.
Hence, we have the

ASSUMPTION OF HOMOGENEITY:

$$p = \frac{M(X \cap C)}{M(X)} = \frac{M(C)}{M(S)}. \tag{2.2}$$


A fluid model gives us an easy interpretation of this assumption.
Suppose that the total situation is represented by a vessel containing two
fluids which do not chemically interact but are completely miscible. For
discussion let the substances be water and alcohol. The weight of the
water corresponds to the measure of the subset conditioned to $A_2$, and
the weight of the alcohol corresponds to the measure of the subset C
conditioned to $A_1$. The subset X corresponds to a jigger full of the
mixture, and of course, if the fluids are well mixed, the fractional weight
of alcohol in a jigger full is much the same as that in the whole highball.
Following the same sort of assumption, a standard method of determining
the hemoglobin content of the blood proceeds by assuming that the
mixture is homogeneous. Twenty cubic millimeters of blood are withdrawn
from the patient, and the concentration of hemoglobin in the
sample is determined. It is assumed that the concentration obtained is
the concentration of hemoglobin in all the patient's blood as it passes by
the point of withdrawal.

Fig. 2.3. A sample X of stimuli from the entire set before and after the event
(panels: Before event, After event). A and B are two subsamples of X. The
shaded portion represents the elements conditioned to $A_1$.
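Equation 2.1 and the homogeneity assumption can be illustrated with a toy stimulus set. The element names and measures below are hypothetical; the sketch only shows that equation 2.2 is a genuine assumption, satisfied by some samples and violated by others.

```python
from fractions import Fraction

# Hypothetical measures (weights) for four stimulus elements.
MEASURE = {"s1": 2, "s2": 4, "s3": 2, "s4": 4}


def measure(subset):
    """Additive measure: the sum of the weights of the elements."""
    return sum(MEASURE[element] for element in subset)


def response_probability(sample, conditioned):
    """Equation 2.1: p = M(X & C) / M(X)."""
    return Fraction(measure(sample & conditioned), measure(sample))


S = set(MEASURE)
C = {"s1", "s2"}                        # elements conditioned to A_1
population_fraction = Fraction(measure(C), measure(S))   # M(C) / M(S)

homogeneous_sample = {"s1", "s3"}       # satisfies equation 2.2
skewed_sample = {"s2", "s3"}            # violates equation 2.2

print(population_fraction)
print(response_probability(homogeneous_sample, C))
print(response_probability(skewed_sample, C))
```

In the fluid analogy, the second sample corresponds to a jigger drawn from a poorly mixed vessel.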


2.4 DEDUCTION OF THE OPERATORS

Changes in probabilities are described by changes in the measure of
elements conditioned to $A_1$; we next postulate a procedure for describing
these changes. It is assumed that a subset A of the sample X becomes
completely conditioned to $A_1$, whereas another subset B of X becomes
completely conditioned to $A_2$ responses, after an event has occurred.
This situation is illustrated in Fig. 2.3. Since an element can be conditioned
to only one class of responses, A and B are disjunct, that is, have
no elements in common. Furthermore, the measures of A and B depend
upon what event has just occurred. After this reconditioning has taken
place the sample is returned to the population S. In terms of the fluid
model discussed in the preceding section, the procedure corresponds to
this: Two jiggers, A and B, are removed from the highball mixture; the
jiggers need not be of the same size. Jigger A is emptied, and an equivalent
weight of pure alcohol is returned to the highball; for jigger B an equivalent
weight of water is returned.

Initially the measure of C is $M(C)$. An event will change this measure
an amount $\Delta M(C)$. The part of A which was not in C has been added
to the part of S conditioned to $A_1$; its measure is $M(A) - M(A \cap C)$.
The part of B which was in C has been removed from the part of S conditioned
to $A_1$, and its measure is $M(B \cap C)$. Hence we have

$$\Delta M(C) = M(A) - M(A \cap C) - M(B \cap C) = M(A)\left[1 - \frac{M(A \cap C)}{M(A)}\right] - M(B)\left[\frac{M(B \cap C)}{M(B)}\right], \tag{2.3}$$

provided that neither denominator vanishes. We now extend our
assumption of homogeneity so that

$$p = \frac{M(A \cap C)}{M(A)} = \frac{M(B \cap C)}{M(B)} = \frac{M(C)}{M(S)}. \tag{2.4}$$

Hence, equation 2.3, representing the change in the measure of C, becomes

$$\Delta M(C) = M(A)(1 - p) - M(B)p. \tag{2.5}$$

We then define

$$a = \frac{M(A)}{M(S)}, \qquad b = \frac{M(B)}{M(S)}, \tag{2.6}$$

so that after dividing equation 2.5 through by $M(S)$ we write

$$\Delta p = \frac{\Delta M(C)}{M(S)} = a(1 - p) - bp. \tag{2.7}$$

The new value of probability is then

$$Qp = p + \Delta p = p + a(1 - p) - bp. \tag{2.8}$$


This equation is the gain-loss form given by equation 1.19 of the last
chapter and so we have deduced from the stimulus model the operators
in Chapter 1. The definitions (2.6) of the parameters a and b in terms of
sets of stimulus elements may provide readers with an additional interpretation
of the meaning of those parameters. As we saw in equation
2.5, the increment $\Delta M(C)$ in the measure of elements conditioned to $A_1$
had two parts. The first part, $M(A)(1 - p)$, represents a gain in the
measure of elements conditioned to $A_1$ and is parallel to the gain term,
$a(1 - p)$, in the gain-loss form of $Qp$. Similarly the part, $M(B)p$,
represents a loss in measure and is parallel to the loss term $bp$.
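The deduction can be checked on a small finite example. The sketch below assumes twelve stimulus elements of unit measure (so that measure is simply a count) and picks hypothetical subsets A and B that satisfy the extended homogeneity assumption; none of these particular sets come from the text. It reconditions the elements directly and compares the result with the gain-loss operator.

```python
from fractions import Fraction

# Twelve hypothetical elements, each of unit measure.
S = set(range(12))
C = set(range(6))          # conditioned to A_1 before the event
A = {0, 1, 6, 7}           # subsample that becomes conditioned to A_1
B = {2, 8}                 # subsample that becomes conditioned to A_2


def fraction_in(subset, conditioned):
    """Fraction of the measure of `subset` lying in `conditioned`."""
    return Fraction(len(subset & conditioned), len(subset))


p = fraction_in(S, C)
homogeneous = fraction_in(A, C) == p and fraction_in(B, C) == p

# Recondition directly: add all of A to C, then remove all of B.
C_after = (C | A) - B
p_after = fraction_in(S, C_after)

# Gain-loss operator with a = M(A)/M(S) and b = M(B)/M(S).
a = Fraction(len(A), len(S))
b = Fraction(len(B), len(S))
p_operator = p + a * (1 - p) - b * p

print(homogeneous, p_after, p_operator)
```

Both routes give 7/12. If A or B were chosen so that the homogeneity assumption failed, the two values would in general differ, since equation 2.5 depends on that assumption.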


*2.5 EXTENSION TO r RESPONSE CLASSES; HOMOGENEITY
AND COMBINING CLASSES

The foregoing theory of stimulus sampling and conditioning was
developed in terms of two classes of responses, but the extension to r
classes is straightforward. Again we shall exhibit correspondence
between the operators of Chapter 1 and the set-theoretic model. At
some point in the process, let S have r subsets, $C_j$ ($j = 1, 2, \cdots, r$),
conditioned to the r response classes. A sample X is drawn, and the
probability of response $A_j$ is

$$p_j = \frac{M(X \cap C_j)}{M(X)} = \frac{M(C_j)}{M(S)} \qquad (j = 1, 2, \cdots, r). \tag{2.9}$$

We then let X contain r disjunct subsamples, $U_j$, which become conditioned
to the response classes indicated by their subscripts when an event occurs.
An (r + 1)st class in X consists of the residual elements not contained in
the $U_j$; these residual elements are left unchanged in the reconditioning
process. If we are describing the changes in the measure of elements
conditioned to the jth class, the measure of the elements in $U_j$ and not
in $C_j$ is added to the measure of $C_j$; that is, there is an increment,
$M(U_j) - M(U_j \cap C_j)$. Moreover, the measure of elements in any other
subset $U_k$ which are also in $C_j$ will be subtracted. This leads to decrements,
$M(U_k \cap C_j)$, from all subsets except $U_j$, that is, for all $k \ne j$.
The change in the measure of $C_j$ is then

$$\Delta M(C_j) = M(U_j) - M(U_j \cap C_j) - \sum_{k \ne j} M(U_k \cap C_j) = M(U_j) - \sum_{k=1}^{r} M(U_k \cap C_j). \tag{2.10}$$

Our assumption of homogeneity becomes

$$\frac{M(U_k \cap C_j)}{M(U_k)} = \frac{M(C_j)}{M(S)} \qquad (j, k = 1, 2, \cdots, r). \tag{2.11}$$

We then define

$$u_k = \frac{M(U_k)}{M(S)} \qquad (k = 1, 2, \cdots, r). \tag{2.12}$$

Combining the last two equations then leads to

$$M(U_k \cap C_j) = p_j M(U_k) = p_j u_k M(S). \tag{2.13}$$

Inserting these relations in equation 2.10 expressing the change in the
measure of $C_j$, and then dividing through by $M(S)$, we obtain

$$\Delta p_j = u_j - \sum_{k=1}^{r} u_k p_j. \tag{2.14}$$

The new probability of the jth class is

$$p_j + \Delta p_j = \Bigl(1 - \sum_{k=1}^{r} u_k\Bigr) p_j + u_j = \alpha p_j + u_j = \alpha p_j + (1 - \alpha)\lambda_j, \tag{2.15}$$

where as before we have $\alpha = 1 - \sum_{k=1}^{r} u_k$ and $u_j = (1 - \alpha)\lambda_j$.

This result is the general element of the vector Tp given in Chapter 1,
expression 1.39. It is worth noting that expression 1.39 was developed
as a consequence of our rule for combining classes; without that
restriction, it would have been much more complicated. Therefore we see
that our rule for combining classes automatically falls out of this
generalization of Estes' stimulus theory.
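The r-class result can be verified with exact arithmetic. In this sketch the probabilities $p_j$ and the fractions $u_k$ are illustrative values (assumptions, not from the text); the code applies the increment of equation 2.14 to each class and compares it with the fixed-point form of equation 2.15.

```python
from fractions import Fraction as F


def recondition(p, u):
    """Add the increment of equation 2.14 to each class:
    delta p_j = u_j - p_j * sum_k u_k."""
    total_u = sum(u)
    return [pj + uj - pj * total_u for pj, uj in zip(p, u)]


def fixed_point_form(p, u):
    """Equation 2.15 with alpha = 1 - sum_k u_k and lambda_j = u_j / (1 - alpha).

    Requires sum(u) > 0 so that the lambda_j are defined.
    """
    total_u = sum(u)
    alpha = 1 - total_u
    lam = [uj / total_u for uj in u]
    return [alpha * pj + (1 - alpha) * lj for pj, lj in zip(p, lam)]


p = [F(1, 2), F(1, 3), F(1, 6)]       # illustrative probabilities, sum to 1
u = [F(1, 10), F(1, 5), F(1, 10)]     # illustrative u_k = M(U_k) / M(S)

new_p = recondition(p, u)
print(new_p, sum(new_p))
print(fixed_point_form(p, u))
```

The two routes agree, and the new probabilities again sum to unity, as the column-sum condition of Section 1.8 requires.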


2.6 SUMMARY 


In this chapter we have deduced the event operators, first introduced
in Chapter 1, from a simple model of stimulus sampling and conditioning.
This model is described in terms of mathematical set theory. It is assumed
that any stimulus situation can be represented by a set of abstract elements
and that elements exist in one of $r$ states, each state corresponding to a
class of responses. In these terms, a conditioning process involves
changing the states of elements in the set. The probability of a response
class is defined in terms of the states of elements in a sample drawn on a
trial. After an assumption of homogeneity in the sampling process is
introduced, the operators of Chapter 1 are deduced from the model. In
Section 2.5 it is shown that the operators deduced from the set-theoretic
model for $r$ response classes agree with those which result from the
combining classes restriction given in Section 1.8.


REFERENCES

1. Estes, W. K. Toward a statistical theory of learning. Psychol. Rev., 1950, 57, 94-107.
2. Bush, R. R., and Mosteller, F. A model for stimulus generalization and discrimination. Psychol. Rev., 1951, 58, 413-423.
3. Estes, W. K., and Burke, C. J. A theory of stimulus variability in learning. Psychol. Rev., 1953, 60, 276-286.
4. Estes, W. K., personal communication.


CHAPTER 3 


Sequences of Events 


3.1 INTRODUCTION


Let us review briefly the basic framework of the mathematical system. 
We have a set of alternatives $A_j$, which are to be identified with classes of
responses, and a set of probability variables $p_j$ corresponding to those
alternatives. In the simplest case, which is of major interest in the
following chapters, there are two alternatives $A_1$ and $A_2$ with probabilities
of occurrence $p$ and $1 - p$, respectively. As described in Chapter 1,
these probabilities are to be changed from time to time by mathematical 
operators which correspond to occurrences of events. These events in 
the mathematical system are to be identified with various kinds of events 
in the real world which influence the behavior of an organism, events 
which produce learning. These events may be almost entirely at the 
disposal of an experimenter; for example, they may correspond to 
rewards and punishments. On the other hand, events which alter 
behavior may be partially controlled by the organism whose behavior 
is being studied. An occurrence of a response and whatever stimulus 
changes it may produce may constitute an event, or perhaps a reward 
may be contingent upon the occurrence of a particular response. 

In nearly all learning experiments, we are less interested in the effect 
of a single event than in the cumulative effect of a sequence of events. 
In some experiments, a single event occurs repeatedly on a series of trials. 
An example is the Graham-Gagné runway experiment [1] discussed in 
Chapter 14; a rat finds a reward at the end of the runway on each experi- 
mental trial. In other experiments, an event such as reward may occur 
periodically; food might follow every fourth emission of a particular 
response. Such a schedule of reinforcement has been called a fixed ratio 
schedule [2, 3]. In this example, we would consider a rewarded response 
one type of event and an unrewarded response another type of event. 
Thus fixed ratio reinforcement involves a systematic sequence of two 
types of events. In still other kinds of reinforcement schedules, called
random ratio, reward occurs a fixed proportion of the time, but in a more
or less random sequence of rewards and non-rewards [3]. Strict random- 
ness is seldom used, but randomization within blocks of trials is common. 

Because effects of events are represented by the application of operators 
and because the events occur repeatedly, we are concerned with sequences 
of operators that are applied to the probability variables. In this chapter 
we analyze various kinds of sequences of operators, sequences that corre- 
spond to sequences of events in common use by the experimental psycholo- 
gist. Before analyzing these operator sequences in detail we first discuss 
the structure of some simple sequences of events to focus attention upon 
one important property: the degree of dependence of a given event upon 
earlier events in the sequence. There are three sequences of events 
described: a random sequence, a somewhat dependent sequence, and a 
completely systematic sequence. Corresponding to these three sequences 
there are operator sequences, and results for these various cases are 
obtained. Finally, it turns out that such operator sequences are related 
to three kinds of experiments: (1) where the experimenter controls the
occurrence of events, (2) where the subject controls the events, and (3) 
where the experimenter and subject together control the events. 

In connection with one of the event sequences described, we present 
an elementary discussion of the theory of Markov chains and how they 
are related to the general model. This discussion is closely related to an 
exposition by Miller [4]. (The reader may also wish to consult two papers
by Miller and Frick [5, 6] for somewhat different analyses of the sequential 
properties of learning data.) 


3.2 THE STRUCTURE OF SOME ELEMENTARY SEQUENCES


This section introduces some easy sequences that illustrate differing
degrees of dependency. By dependency we mean the amount of information
earlier members of a sequence provide about later members.

Suppose there are two types of events A and B, and that we record in
temporal order a sequence of fifty-one events. Below are examples of
three types of sequences that might be observed. If they are examined,
certain important similarities and differences will be discovered. The
sequences are to be read from left to right; they are given in groups of
five for convenience in reading, so the separations have no meaning.


SEQUENCE 1


ABABB ABABA BAABB AAAAA AABBB 
AAAAA BAAAB BABAA AABAA AAABA A 




SEQUENCE 2 
AABAA ABAAA AABAA BAABA BABAB 
ABABA ABAAA BAABA BAAAA BABAB A 


SEQUENCE 3 
AABAA BAABA ABAAB AABAA BAABA 
ABAAB AABAA BAABA ABAAB AABAA B 


Examination of these three sequences reveals that if we disregard the
arrangement and merely count the number of type B events, each sequence
has 17 (one-third) B's. In the first sequence the B's occur in a haphazard
order. In the second sequence no B's occur together. In the third
sequence, not only do no B's occur together, but given two A's in succession,
a B is bound to follow; and just as inevitably a B will be followed
by two A's. These three sequences illustrate three of the many possible
degrees of dependence that sequences can have.

The first sequence was constructed by letting the probability of an A
on each trial be 2/3. Then a long sequence was put together with the aid
of a random number table, and we lifted out the first consecutive sequence
of length 51 that had 17 B's.

The second sequence was constructed by starting with an A and letting
the probability of a B following an A be 0.5, and the probability of a B
following a B be zero. Then such a sequence was constructed with random
numbers, and the first consecutive sequence of length 51 containing
17 B's was chosen for display.

The third sequence was arranged with a completely systematic pattern 
of two A’s followed by a B. If we know the first two elements of the 
third sequence and the rule of formation we can state the entry in the 
1175th position (to choose a position quite far from the starting point) 
without difficulty, and if we know the last two elements that have occurred 
we can always predict the next element.
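The three rules of formation can be sketched in code. This is an illustration under stated assumptions rather than the procedure used for the printed sequences: it draws from Python's pseudo-random generator instead of a random number table, it does not search for the first window of length 51 containing exactly 17 B's, and all function names are hypothetical.

```python
import random


def independent_sequence(n, p_a, rng):
    """Independent trials: each entry is A with probability p_a."""
    return "".join("A" if rng.random() < p_a else "B" for _ in range(n))


def markov_sequence(n, p_b_after_a, rng):
    """Start with A; B follows A with probability p_b_after_a, B never follows B."""
    seq = ["A"]
    while len(seq) < n:
        if seq[-1] == "B":
            seq.append("A")
        else:
            seq.append("B" if rng.random() < p_b_after_a else "A")
    return "".join(seq)


def systematic_sequence(n):
    """Completely systematic pattern: two A's followed by a B."""
    return ("AAB" * (n // 3 + 1))[:n]


def systematic_entry(position):
    """Entry at a 1-based position of the systematic sequence."""
    return "AAB"[(position - 1) % 3]


rng = random.Random(0)                    # fixed seed, an arbitrary choice
seq1 = independent_sequence(51, 2 / 3, rng)
seq2 = markov_sequence(51, 0.5, rng)
seq3 = systematic_sequence(51)
entry_1175 = systematic_entry(1175)       # no random numbers are needed here
```

Only the systematic rule allows a distant entry such as the 1175th to be stated outright; for the other two rules the sketch can produce a sample but not a prediction.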

Sequence 2 has in common with sequence 3 the property that if the last
element was a B we know the next element is an A. But the resemblance
stops here. The next element in sequence 2 has a 50-50 chance of being
a B, and we are still certain of the next element in sequence 3. And
knowing only the first two elements of sequence 2 we would at best be
able to say that the probability of an A on the 1175th trial is about 2/3.
Sequences 1 and 2 have these properties in common. If they were
extended infinitely, knowing the early elements helps us very little in
predicting elements far from the start. But knowing an element of
sequence 2 we can make a differential prediction of the next element,
depending on which element has just occurred; if a B has just occurred
an A is next, if an A has just occurred A and B have an equal chance of
occurring in the next position. But for long* sequences of the type
given by sequence 1 above, knowledge of the previous value is of no
real assistance: each trial is independent of the previous one, and all
that can be said is that the probability of an A on the next trial is 2/3.

One way of analyzing sequences (sometimes called time series) arises 
from considering what length of run of values contributes to accurate 
prediction of the next entry. In our sequence 1, with a knowledge of 
the rule of formation, complete knowledge of the sequence up to the 1174th 
entry does not assist us in our prediction of the 1175th. In sequence 2, 
knowledge of the last entry was helpful, but knowledge of the entire 
previous sequence is no better than knowledge of the last entry. Sequence
3 illustrates a case where knowledge of the last two entries improves the 
prediction, but it differs from the others in that knowledge of any two 
adjacent previous entries, along with their entry number, gives complete 
information about any entry in the sequence. Actually knowledge of 
one B and its entry number is adequate, but two A’s are needed. We can, 
of course, construct situations where the last three, the last four, etc., 
members of a sequence provide all the information there is for predicting 
the next entry. This is a common approach to time series analysis. 

When previous entries provide no information about the next entry 
other than about the long-run proportions of A’s and B's, we are said to 
have independent trials (example: sequence 1). When the last entry pro- 
vides all the available information about the next entry we have what is 
commonly called a Markov chain [7] (example: sequence 2). In Section 
3.11 we have more to say about Markov chains. 
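The degrees of dependence described above can be examined directly on the three printed sequences. The sketch below is an illustration only; the helper name `b_frequency_after` is hypothetical, and the sequences are transcribed from the display in this section with the spaces removed.

```python
from collections import Counter

SEQ1 = "ABABB ABABA BAABB AAAAA AABBB AAAAA BAAAB BABAA AABAA AAABA A".replace(" ", "")
SEQ2 = "AABAA ABAAA AABAA BAABA BABAB ABABA ABAAA BAABA BAAAA BABAB A".replace(" ", "")
SEQ3 = "AABAA BAABA ABAAB AABAA BAABA ABAAB AABAA BAABA ABAAB AABAA B".replace(" ", "")


def b_frequency_after(seq):
    """Relative frequency of B immediately after each symbol (first-order counts)."""
    pairs = Counter(zip(seq, seq[1:]))
    result = {}
    for prev in "AB":
        total = pairs[(prev, "A")] + pairs[(prev, "B")]
        result[prev] = pairs[(prev, "B")] / total if total else None
    return result


freq1 = b_frequency_after(SEQ1)
freq2 = b_frequency_after(SEQ2)
freq3 = b_frequency_after(SEQ3)
```

In sequences 2 and 3 a B never follows a B. First-order counts of this kind use only the last entry, so they cannot reveal the additional regularity of sequence 3, for which the last two entries are needed.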


3.3 REPETITIVE APPLICATION OF A SINGLE OPERATOR $Q_i$


If we should have a sequence of identical events, we need to compute
the effect of a repetitive application of a single operator. In this section
we consider only two response classes, $A_1$ and $A_2$, with probabilities $p$
and $1 - p$, respectively. The single event $E_i$ is then represented by the
row operator $Q_i$ which we write in the fixed point form of equation 1.20:

$$Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i. \tag{3.1}$$

If the probability of response $A_1$ before event $E_i$ occurs is $p$, then $Q_i p$ is
the probability of that response after $E_i$ has occurred. Hence, if there is

* We say "long" because in a finite sequence constructed to have a fixed proportion of 
A's and B's some information is available from knowledge of the values that have 
previously occurred. This is like sampling without replacement from a finite urn of 
known composition. We wish to sample with replacement and to avoid introducing 
this special point into the discussion. 




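another occurrence of $E_i$ the operator $Q_i$ must be applied to the probability
$Q_i p$. We accomplish this second application of $Q_i$ by letting
$(Q_i p)$ be the operand instead of $p$ in equation 3.1. We have

$$Q_i(Q_i p) = \alpha_i (Q_i p) + (1 - \alpha_i)\lambda_i. \tag{3.2}$$

We then replace $(Q_i p)$ on the right side of this equation with its equivalent
given by equation 3.1 and have

$$\begin{aligned}
Q_i(Q_i p) &= \alpha_i[\alpha_i p + (1 - \alpha_i)\lambda_i] + (1 - \alpha_i)\lambda_i \\
&= \alpha_i^2 p + (1 - \alpha_i^2)\lambda_i.
\end{aligned} \tag{3.3}$$

It is convenient to denote this double application of $Q_i$ to $p$ by $Q_i^2 p$. Now
if event $E_i$ occurs for a third time we need to compute $Q_i^3 p$:

$$\begin{aligned}
Q_i^3 p = Q_i(Q_i^2 p) &= \alpha_i (Q_i^2 p) + (1 - \alpha_i)\lambda_i \\
&= \alpha_i[\alpha_i^2 p + (1 - \alpha_i^2)\lambda_i] + (1 - \alpha_i)\lambda_i \\
&= \alpha_i^3 p + (1 - \alpha_i^3)\lambda_i.
\end{aligned} \tag{3.4}$$

The forms of $Q_i p$, $Q_i^2 p$, and $Q_i^3 p$ suggest that the general form of the
result for any number $n$ of applications is

$$Q_i^n p = \alpha_i^n p + (1 - \alpha_i^n)\lambda_i. \tag{3.5}$$

Assuming that this is the correct form for $n$ applications we can show
by mathematical induction* that it is correct for $n + 1$, because

$$\begin{aligned}
Q_i(Q_i^n p) &= \alpha_i (Q_i^n p) + (1 - \alpha_i)\lambda_i \\
&= \alpha_i[\alpha_i^n p + (1 - \alpha_i^n)\lambda_i] + (1 - \alpha_i)\lambda_i \\
&= \alpha_i^{n+1} p + (1 - \alpha_i^{n+1})\lambda_i.
\end{aligned} \tag{3.6}$$

The fact that the right side of equation 3.6 is identical to the right side of
equation 3.5 except that $n$ has been replaced by $n + 1$ proves that
conjecture 3.5 is correct.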


* For discussion of mathematical induction see G. Birkhoff and S. MacLane [8].
We are using what they call the "principle of finite induction" which they give as follows
(with notation slightly changed): "Let there be associated with each positive integer $k$
a proposition $P(k)$ which is either true or false. If, firstly, $P(1)$ is true and secondly, for
all $n$, $P(n)$ implies $P(n + 1)$, then $P(k)$ is true for all positive integers $k$." (This quotation
is used with the permission of the Macmillan Company.)
Induction proofs are particularly good when we have a good guess about the
correct answer and want to test its truth or falsity. In our problem we know $P(1)$ is true,
because here $P(1)$ is the definition shown in equation 3.1. The $P(n)$ we have chosen is
shown in equation 3.5. In equation 3.6 we have shown that we can go from an arbitrary
number to the next higher by legitimate operations and still get the same formula;
and this completes the proof.




When $\alpha_i$ is less than unity in absolute value, we see that, as $n$ tends to
infinity, then $\alpha_i^n$ tends to zero. Hence the asymptote of $Q_i^n p$ is $\lambda_i$:

$$\lim_{n \to \infty} Q_i^n p = \lambda_i \qquad (-1 < \alpha_i < 1). \tag{3.7}$$

(Read: the limiting value of $Q_i^n p$ as $n$ tends to infinity is $\lambda_i$.)

It will be recalled that when $\lambda_i$ was first introduced in Section 1.6 it was
shown that $Q_i \lambda_i = \lambda_i$, and so $\lambda_i$ was called the fixed point of the operator
$Q_i$. We have just shown that $\lambda_i$ is also the limit point of the operator $Q_i$:
that is, $\lambda_i$ is the point which $Q_i^n p$ approaches as $n$ gets large for any
value of $p$.

The results of this section say that, if the same event occurs on every
trial, it is possible to compute the response probability on any trial and
that this probability stabilizes at some fixed value. In Fig. 3.1 we give
an example of the values of $Q_i^n p$ plotted against the trial number $n$.


[Figure 3.1: curve of $Q_i^n p$ against trials, $n$, from 0 to 30.]

Fig. 3.1. Values of $Q_i^n p$ plotted against trial number $n$. Equation 3.5 and the values
$p = 0.1$, $\lambda_i = 0.8$, and $\alpha_i = 0.9$ were used in plotting the curve.
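The agreement between repeated application of the operator and the closed form of equation 3.5 can be checked with a short computation. The sketch below is illustrative; the function names are hypothetical, and the parameter values are those read from the caption of Fig. 3.1.

```python
def apply_operator(p, alpha, lam):
    """One application of the operator in fixed-point form (equation 3.1)."""
    return alpha * p + (1 - alpha) * lam


def iterate(p, alpha, lam, n):
    """Apply the operator n times in succession."""
    for _ in range(n):
        p = apply_operator(p, alpha, lam)
    return p


def closed_form(p, alpha, lam, n):
    """Closed form for n applications (equation 3.5)."""
    return alpha**n * p + (1 - alpha**n) * lam


P0, ALPHA, LAM = 0.1, 0.9, 0.8

# Values of the curve for trials 0 through 30, as in Fig. 3.1.
curve = [closed_form(P0, ALPHA, LAM, n) for n in range(31)]
```

With these values the curve rises from 0.1 toward the limit point 0.8, and the iterated and closed forms agree on every trial up to rounding error.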


Although the derivation in this section was in terms of response probabilities
and event operators, this was an unnecessary restriction. Elsewhere
we shall have occasion to use these mathematical results in other
connections, and so we now indicate the more general result. If we have a
recursive formula of the form

$$x_{n+1} = \beta x_n + (1 - \beta)\mu, \tag{3.8}$$

the solution is

$$x_n = \beta^n x_0 + (1 - \beta^n)\mu. \tag{3.9}$$

Equation 3.8 is analogous to equation 3.1 above, and solution 3.9 is
analogous to solution 3.5. When the magnitude of $\beta$ is less than unity,
$x_n$ tends to $\mu$ as $n$ approaches infinity.


*3.4 ALTERNATIVE APPROACHES


In understanding the operators it may help some readers to realize that
in finding the general form of $Q_i^n p$ we are actually solving a difference
equation of the form

$$Q_i^{n+1} p - \alpha_i Q_i^n p = (1 - \alpha_i)\lambda_i. \tag{3.10}$$

Equation 3.10 is linear with constant coefficients, and represents quite a
simple case. Such equations are handled in full generality by C. Jordan
[9], whose book the reader may consult for more details. Because we
have solved this equation by induction, the difference equation derivation
is not given here.

Another approach to this problem is through a differential equation
approximation. If we think of $y = Q_i^n p$, then $y + \Delta y = Q_i^{n+1} p$,
where the independent variable is $n$. When we go from $n$ to $n + 1$ the
increment in $n$ is $\Delta n$ (which happens to be unity). Now our recursion
relation can be written

$$Q_i^{n+1} p = \alpha_i Q_i^n p + (1 - \alpha_i)\lambda_i \tag{3.11}$$

and, subtracting $Q_i^n p$ from both sides, we have

$$Q_i^{n+1} p - Q_i^n p = (1 - \alpha_i)(\lambda_i - Q_i^n p), \tag{3.12}$$

or

$$\frac{(y + \Delta y) - y}{\Delta n} = \frac{\Delta y}{\Delta n} = (1 - \alpha_i)(\lambda_i - y). \tag{3.13}$$

If we think of $n$ as continuous, we may approximate this last equation
by replacing the ratio $\Delta y / \Delta n$ by the derivative $dy/dn$. In this way we obtain
the differential equation

$$\frac{dy}{dn} = (1 - \alpha_i)(\lambda_i - y), \tag{3.14}$$

which has the solution

$$y = e^{-(1 - \alpha_i)n}\, p + \bigl[1 - e^{-(1 - \alpha_i)n}\bigr]\lambda_i. \tag{3.15}$$

This equation gives an approximation to $Q_i^n p$ and has the same form
as the exact formula, equation 3.5. Where we formerly had $\alpha_i^n$ we now
have the exponential. When $(1 - \alpha_i)$ is small we can take $\alpha_i$ as an
approximation to the exponential $e^{-(1 - \alpha_i)}$. Hence we conclude that the
solution of the differential equation is a reasonable approximation to the
exact formula for $Q_i^n p$ when $(1 - \alpha_i)$ is small compared to unity. The
approximate formula does yield the correct values when $n = 0$ and when
$n$ tends to infinity.
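How close the exponential approximation comes to the exact formula can be examined numerically. The following sketch is illustrative only: the function names and the two values of $\alpha_i$ being compared are assumptions chosen to contrast a small and a large value of $(1 - \alpha_i)$, and the exponential form is written with initial value $p$ at $n = 0$.

```python
import math


def exact(p, alpha, lam, n):
    """Exact value of n applications of the operator (equation 3.5)."""
    return alpha**n * p + (1 - alpha**n) * lam


def exponential_approximation(p, alpha, lam, n):
    """Solution of the differential equation, with exp(-(1 - alpha) n) in place of alpha**n."""
    decay = math.exp(-(1 - alpha) * n)
    return decay * p + (1 - decay) * lam


def largest_gap(p, alpha, lam, n_max):
    """Largest absolute difference between the two formulas for n = 0, ..., n_max."""
    return max(
        abs(exact(p, alpha, lam, n) - exponential_approximation(p, alpha, lam, n))
        for n in range(n_max + 1)
    )


# Hypothetical comparison: (1 - alpha) small versus (1 - alpha) large.
gap_small_step = largest_gap(0.1, 0.99, 0.8, 500)
gap_large_step = largest_gap(0.1, 0.50, 0.8, 500)
```

Under these assumed values the gap is much smaller when $(1 - \alpha_i)$ is small, which is the condition stated in the text; both formulas agree at $n = 0$ and share the same limit.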

We have included the foregoing discussion of the differential equation 
approach mainly because it has been a commonly used device in psycho- 
logical literature [see, for example, 1, 10, 11]. However, in linear problems 
in this book we have found it better and easier to solve the difference 
equation directly. 


*3.5 REPETITIVE APPLICATION OF MATRIX OPERATOR $T_i$


This section generalizes the results of the last section to any number
$r$ of alternatives when the combining classes restriction of Section 1.8
is imposed. According to equation 1.32 the matrix operator $T_i$ has the
form

$$T_i = \alpha_i
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
+ (1 - \alpha_i)
\begin{bmatrix}
\lambda_{i1} & \lambda_{i1} & \cdots & \lambda_{i1} \\
\lambda_{i2} & \lambda_{i2} & \cdots & \lambda_{i2} \\
\vdots & \vdots & & \vdots \\
\lambda_{ir} & \lambda_{ir} & \cdots & \lambda_{ir}
\end{bmatrix}
= \alpha_i I + (1 - \alpha_i)\Lambda_i. \tag{3.16}$$

In other words, $T_i$ is a linear combination (or weighted sum) of two
matrices, the identity matrix $I$ and the matrix $\Lambda_i$. We now wish to
apply the matrix operator $T_i$ to the probability vector $\mathbf{p}$. One application
will yield a vector $T_i \mathbf{p}$. We then apply $T_i$ to $T_i \mathbf{p}$, that is, multiply the
vector $T_i \mathbf{p}$ by the matrix $T_i$ to get a vector $T_i(T_i \mathbf{p})$. But according to the
associativity law for matrix multiplication, discussed in Section 1.5, we
get the same result by first multiplying $T_i$ by $T_i$ and then multiplying $\mathbf{p}$
by this product, namely, $T_i(T_i \mathbf{p}) = (T_i T_i)\mathbf{p}$. More generally, if we wish
to apply $T_i$ repeatedly, say $n$ times, to $\mathbf{p}$, we may do so by computing
$(T_i T_i \cdots T_i)\mathbf{p}$ where there are $n$ matrices $T_i$ in the parentheses. The
matrix product $(T_i T_i \cdots T_i)$ is denoted by $T_i^n$, and it is called the $n$th
power or iterate of the matrix $T_i$. Our task then is to compute the powers
of $T_i$.

We first compute the second power or square of $T_i$:

$$T_i^2 = T_i T_i = [\alpha_i I + (1 - \alpha_i)\Lambda_i][\alpha_i I + (1 - \alpha_i)\Lambda_i]. \tag{3.17}$$


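In computing the indicated multiplication we usually must be careful
about the order of the factors in a product because we saw in Section 1.5
that matrix multiplication is generally non-commutative. In the present
case there is no trouble. Thus,

$$T_i^2 = \alpha_i^2 I^2 + \alpha_i(1 - \alpha_i) I \Lambda_i + \alpha_i(1 - \alpha_i)\Lambda_i I + (1 - \alpha_i)^2 \Lambda_i^2. \tag{3.18}$$

This result is simplified by making a number of observations. First,
the square of $I$ yields just $I$; in fact $I^n = I$ for $n > 0$. Second, since $I$ is
the identity matrix, $I\Lambda_i = \Lambda_i I = \Lambda_i$. Finally we compute the square
of $\Lambda_i$:

$$\Lambda_i^2 =
\begin{bmatrix}
\lambda_{i1} & \cdots & \lambda_{i1} \\
\lambda_{i2} & \cdots & \lambda_{i2} \\
\vdots & & \vdots \\
\lambda_{ir} & \cdots & \lambda_{ir}
\end{bmatrix}
\begin{bmatrix}
\lambda_{i1} & \cdots & \lambda_{i1} \\
\lambda_{i2} & \cdots & \lambda_{i2} \\
\vdots & & \vdots \\
\lambda_{ir} & \cdots & \lambda_{ir}
\end{bmatrix}
=
\begin{bmatrix}
\lambda_{i1}\sum_j \lambda_{ij} & \cdots & \lambda_{i1}\sum_j \lambda_{ij} \\
\lambda_{i2}\sum_j \lambda_{ij} & \cdots & \lambda_{i2}\sum_j \lambda_{ij} \\
\vdots & & \vdots \\
\lambda_{ir}\sum_j \lambda_{ij} & \cdots & \lambda_{ir}\sum_j \lambda_{ij}
\end{bmatrix}. \tag{3.19}$$

But from Section 1.8 we know that

$$\sum_{j=1}^{r} \lambda_{ij} = 1, \tag{3.20}$$

that is, that the vector $\boldsymbol{\lambda}_i$ is a probability vector, and so its elements $\lambda_{ij}$
sum to unity. Hence we have the result that

$$\Lambda_i^2 = \Lambda_i. \tag{3.21}$$

Equation 3.18 now becomes

$$\begin{aligned}
T_i^2 &= \alpha_i^2 I + 2\alpha_i(1 - \alpha_i)\Lambda_i + (1 - \alpha_i)^2 \Lambda_i \\
&= \alpha_i^2 I + (1 - \alpha_i^2)\Lambda_i.
\end{aligned} \tag{3.22}$$

The computation of higher powers of $T_i$ follows very easily now. By
the same procedure we get

$$T_i^3 = \alpha_i^3 I + (1 - \alpha_i^3)\Lambda_i. \tag{3.23}$$

An induction proof like the one used in Section 3.3 gives

$$T_i^n = \alpha_i^n I + (1 - \alpha_i^n)\Lambda_i. \tag{3.24}$$

Finally, when we apply $T_i^n$ to the vector $\mathbf{p}$ we obtain

$$T_i^n \mathbf{p} = \alpha_i^n \mathbf{p} + (1 - \alpha_i^n)\boldsymbol{\lambda}_i, \tag{3.25}$$

and this, of course, is exactly the form that might have been conjectured
from our work with the $Q_i$. However, had we not imposed the combining
classes condition, we would not have obtained this simple generalization.
The vector $\boldsymbol{\lambda}_i$ may now be interpreted as a limit vector of the operator $T_i$
provided that $|\alpha_i| < 1$, since $T_i^n \mathbf{p}$ tends to $\boldsymbol{\lambda}_i$ as $n$ becomes large.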
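Equations 3.24 and 3.25 can be verified by multiplying the matrices out. The sketch below uses plain Python lists and is illustrative only; the helper names and the numerical values of $\alpha_i$, $\boldsymbol{\lambda}_i$, and $\mathbf{p}$ are assumptions, and the matrix $\Lambda_i$ is taken to have every column equal to the vector $\boldsymbol{\lambda}_i$.

```python
def make_t(alpha, lam):
    """T = alpha * I + (1 - alpha) * Lambda, where every column of Lambda equals lam."""
    r = len(lam)
    return [[alpha * (1.0 if j == k else 0.0) + (1 - alpha) * lam[j]
             for k in range(r)] for j in range(r)]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(b)
    return [[sum(a[j][m] * b[m][k] for m in range(n)) for k in range(n)]
            for j in range(len(a))]


def matvec(a, v):
    """Product of a matrix with a column vector."""
    return [sum(a_jk * v_k for a_jk, v_k in zip(row, v)) for row in a]


def matrix_power(t, n):
    """n-th power of t for n >= 1 by repeated multiplication."""
    result = t
    for _ in range(n - 1):
        result = matmul(result, t)
    return result


# Hypothetical values: three response classes, six applications.
ALPHA, LAM, P, N = 0.8, [0.5, 0.3, 0.2], [0.2, 0.2, 0.6], 6

t_n = matrix_power(make_t(ALPHA, LAM), N)
by_power = matvec(t_n, P)
by_formula = [ALPHA**N * p_j + (1 - ALPHA**N) * lam_j for p_j, lam_j in zip(P, LAM)]
```

The comparison of `by_power` with `by_formula` depends on the elements of `P` summing to one, since that is what makes $\Lambda_i \mathbf{p} = \boldsymbol{\lambda}_i$.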


3.6 COMMUTATIVITY OF THE OPERATORS, $Q_1$ AND $Q_2$

In later sections we handle more complicated sequences than those
given thus far, but first we inquire about the commutativity of the operators
$Q_1$ and $Q_2$. For example, if $Q_1$ is applied when reinforcement is given
and $Q_2$ is applied when reinforcement is withheld, would their order
make any difference? Would it matter whether the reinforcements
were scattered randomly through the sequence or all bunched together
at the beginning or at the end? If it does not matter, the operators should
commute. If the order makes a difference, the operators do not commute.
(Such questions were introduced in Sections 1.4 and 1.5.) First we apply
$Q_1$, then $Q_2$, using the fixed-point form of the operators:

$$\begin{aligned}
Q_2 Q_1 p = Q_2(Q_1 p) &= \alpha_2[\alpha_1 p + (1 - \alpha_1)\lambda_1] + (1 - \alpha_2)\lambda_2 \\
&= \alpha_1 \alpha_2 p + \alpha_2(1 - \alpha_1)\lambda_1 + (1 - \alpha_2)\lambda_2.
\end{aligned} \tag{3.26}$$

On the other hand, applying $Q_2$ first and then $Q_1$ gives

$$Q_1 Q_2 p = \alpha_1 \alpha_2 p + \alpha_1(1 - \alpha_2)\lambda_2 + (1 - \alpha_1)\lambda_1. \tag{3.27}$$

The difference between these two results is

$$(Q_2 Q_1 - Q_1 Q_2)p = (1 - \alpha_1)(1 - \alpha_2)(\lambda_2 - \lambda_1). \tag{3.28}$$

Now if this difference is zero, the operators $Q_1$ and $Q_2$ commute, and we
obtain a considerable simplification in our theory. For example, all
sequences of these operators which contain $m$ applications of $Q_1$ and $n$
of $Q_2$ terminate in the same value of probability if the order of the events
is irrelevant.

From the right side of equation 3.28 we see that $Q_1$ and $Q_2$ commute
if any of the following three conditions holds:

(a) $\alpha_1 = 1$;
(b) $\alpha_2 = 1$;
(c) $\lambda_1 = \lambda_2$.

Condition c implies that $Q_1^n p$ and $Q_2^n p$ tend to the same limits as $n$ gets
large; this case will be of particular interest in Chapters 8 and 11.
Conditions a and b imply that either $Q_1 p = p$ or $Q_2 p = p$, respectively, that
is, that one of the two operators is the identity operator. In Chapter 8
we examine the mathematical consequences of such a condition, and in
Chapter 10 we apply this case in handling some rote learning data.
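The difference in equation 3.28 and the three commutativity conditions can be checked with a few lines of code. This is a minimal sketch; the function names and the numerical parameter values are hypothetical.

```python
def make_operator(alpha, lam):
    """Return the operator p -> alpha * p + (1 - alpha) * lam in fixed-point form."""
    return lambda p: alpha * p + (1 - alpha) * lam


def order_difference(alpha1, lam1, alpha2, lam2, p):
    """Q2(Q1 p) - Q1(Q2 p), the left side of equation 3.28."""
    q1 = make_operator(alpha1, lam1)
    q2 = make_operator(alpha2, lam2)
    return q2(q1(p)) - q1(q2(p))


def predicted_difference(alpha1, lam1, alpha2, lam2):
    """Right side of equation 3.28; note that it does not involve p."""
    return (1 - alpha1) * (1 - alpha2) * (lam2 - lam1)


# Hypothetical parameter values.
general_case = order_difference(0.6, 0.75, 0.9, 0.10, 0.3)
identity_case = order_difference(1.0, 0.75, 0.9, 0.10, 0.3)     # condition (a)
equal_limits_case = order_difference(0.6, 0.40, 0.9, 0.40, 0.3)  # condition (c)
```

With the assumed values the general case gives a nonzero difference that matches the right side of equation 3.28, while the two special cases give zero.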
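*3.7 COMMUTATIVITY OF THE MATRIX OPERATORS $T_1$ AND $T_2$


We now generalize the results of the last section to obtain the conditions
for the commutativity of two matrix operators $T_1$ and $T_2$. We write
these operators in the forms

$$\begin{aligned}
T_1 &= \alpha_1 I + (1 - \alpha_1)\Lambda_1, \\
T_2 &= \alpha_2 I + (1 - \alpha_2)\Lambda_2.
\end{aligned} \tag{3.29}$$

We first compute the product $T_1 T_2$:

$$T_1 T_2 = \alpha_1 \alpha_2 I + \alpha_1(1 - \alpha_2)\Lambda_2 + \alpha_2(1 - \alpha_1)\Lambda_1 + (1 - \alpha_1)(1 - \alpha_2)\Lambda_1 \Lambda_2. \tag{3.30}$$

We have used the fact that $I\Lambda_i = \Lambda_i I = \Lambda_i$. Similarly,

$$T_2 T_1 = \alpha_1 \alpha_2 I + \alpha_1(1 - \alpha_2)\Lambda_2 + \alpha_2(1 - \alpha_1)\Lambda_1 + (1 - \alpha_1)(1 - \alpha_2)\Lambda_2 \Lambda_1. \tag{3.31}$$

We then take the difference of these two product matrices:

$$T_1 T_2 - T_2 T_1 = (1 - \alpha_1)(1 - \alpha_2)(\Lambda_1 \Lambda_2 - \Lambda_2 \Lambda_1). \tag{3.32}$$

Now as a consequence of the fact that each column of both $\Lambda_1$ and $\Lambda_2$
sums to unity, it is easy to show that

$$\Lambda_1 \Lambda_2 = \Lambda_1 \tag{3.33}$$

and

$$\Lambda_2 \Lambda_1 = \Lambda_2. \tag{3.34}$$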

Therefore,

$$T_1 T_2 - T_2 T_1 = (1 - \alpha_1)(1 - \alpha_2)(\Lambda_1 - \Lambda_2). \tag{3.35}$$

If this expression is equal to a matrix containing all zeros, $T_1$ and $T_2$
commute. As in the last section we have three cases:

(a) $\alpha_1 = 1$, which implies $T_1 = I$,
(b) $\alpha_2 = 1$, which implies $T_2 = I$,
(c) $\Lambda_1 = \Lambda_2$.

This last case requires that

$$\lambda_{1j} = \lambda_{2j}, \qquad j = 1, 2, \ldots, r. \tag{3.36}$$

And this in turn implies that the limit vectors $\boldsymbol{\lambda}_1$ and $\boldsymbol{\lambda}_2$ are identical.
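Equation 3.35 can also be checked by direct multiplication. The sketch below is illustrative; the helper names and the parameter values are assumptions, and each matrix $\Lambda_i$ is built with every column equal to the corresponding limit vector, so that each column sums to unity.

```python
def make_t(alpha, lam):
    """T = alpha * I + (1 - alpha) * Lambda, where every column of Lambda equals lam."""
    r = len(lam)
    return [[alpha * (1.0 if j == k else 0.0) + (1 - alpha) * lam[j]
             for k in range(r)] for j in range(r)]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(b)
    return [[sum(a[j][m] * b[m][k] for m in range(n)) for k in range(n)]
            for j in range(len(a))]


def commutator(t1, t2):
    """T1 T2 - T2 T1, computed element by element."""
    left, right = matmul(t1, t2), matmul(t2, t1)
    return [[x - y for x, y in zip(row_l, row_r)] for row_l, row_r in zip(left, right)]


def predicted_commutator(alpha1, lam1, alpha2, lam2):
    """(1 - alpha1)(1 - alpha2)(Lambda1 - Lambda2), the right side of equation 3.35."""
    r = len(lam1)
    scale = (1 - alpha1) * (1 - alpha2)
    return [[scale * (lam1[j] - lam2[j]) for _ in range(r)] for j in range(r)]


# Hypothetical parameter values for three response classes.
A1, L1 = 0.6, [0.5, 0.3, 0.2]
A2, L2 = 0.9, [0.1, 0.2, 0.7]

computed = commutator(make_t(A1, L1), make_t(A2, L2))
predicted = predicted_commutator(A1, L1, A2, L2)
same_limits = commutator(make_t(A1, L1), make_t(A2, L1))   # case (c)
```

With identical limit vectors, as in the last line, every element of the difference vanishes up to rounding error.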


3.8 THE SYSTEMATIC SEQUENCE $(Q_2^v Q_1^u)^n$


With certain kinds of schedules of reinforcement, two classes of events
such as reward and non-reward occur in a systematic sequence such as
that illustrated by sequence 3 in Section 3.2. For example, an experimenter
may want to reward three out of every ten trials on the average
but may not want to allow very long runs without rewards. So he may
use the same sequence on each block of ten trials, giving reward on the
third, fifth, and tenth trials in each block, for example. We represent
such a repeated sequence of events by a systematic sequence of operators
$Q_1$ and $Q_2$, and it is straightforward to compute the effect of such a
sequence on the appropriate probability variables.

In this section we illustrate such systematic sequences of operators
by analyzing in detail a rather simple sequence. Suppose event $E_1$
occurs $u$ times followed by $v$ occurrences of event $E_2$, and that this sequence
is repeated $n$ times. An example is the sequence

$$E_1 E_2 E_2 \quad E_1 E_2 E_2 \quad E_1 E_2 E_2 \quad E_1 E_2 E_2 \quad E_1 E_2 E_2.$$

Here we may take the subsequence $E_1 E_2 E_2$ as basic. Event $E_1$ occurs
$u = 1$ times, event $E_2$ occurs $v = 2$ times, and the subsequence occurs
$n = 5$ times. If we write out the sequence of operators operating on $p$
corresponding to the above sequence we have

$$\begin{aligned}
& Q_2 Q_2 Q_1\, Q_2 Q_2 Q_1\, Q_2 Q_2 Q_1\, Q_2 Q_2 Q_1\, Q_2 Q_2 Q_1\, p \\
&\quad = Q_2^2 Q_1\, Q_2^2 Q_1\, Q_2^2 Q_1\, Q_2^2 Q_1\, Q_2^2 Q_1\, p \\
&\quad = (Q_2^2 Q_1)^5 p.
\end{aligned} \tag{3.37}$$

Note that the operators are written in the reverse order of the occurrence
of the events. The most recent operator is farthest from $p$ when written
in the extended form at the left.

The general operator of the type given above is $(Q_2^v Q_1^u)^n$, or the
subscripts may be reversed if $Q_2$ is applied first. We would like a general
expression for this systematic type of operator. A general expression
can be derived by making successive use of our previous result (equation
3.5) obtained by applying $Q_i$ successively many times. First let us
evaluate $Q_2^v Q_1^u p$:

$$\begin{aligned}
Q_2^v Q_1^u p &= \alpha_2^v (Q_1^u p) + (1 - \alpha_2^v)\lambda_2 \\
&= \alpha_2^v[\alpha_1^u p + (1 - \alpha_1^u)\lambda_1] + (1 - \alpha_2^v)\lambda_2 \\
&= \alpha_1^u \alpha_2^v p + \alpha_2^v(1 - \alpha_1^u)\lambda_1 + (1 - \alpha_2^v)\lambda_2.
\end{aligned} \tag{3.38}$$

Note that this result is linear in $p$ and so could be written as a new operator

$$Q_{u,v}\, p = \alpha_{u,v}\, p + (1 - \alpha_{u,v})\lambda_{u,v}, \tag{3.39}$$

where

$$\alpha_{u,v} = \alpha_1^u \alpha_2^v, \qquad
\lambda_{u,v} = \frac{\alpha_2^v(1 - \alpha_1^u)\lambda_1 + (1 - \alpha_2^v)\lambda_2}{1 - \alpha_1^u \alpha_2^v}. \tag{3.40}$$

We apply $Q_{u,v}$ to $p$ a total of $n$ times, and we get

$$Q_{u,v}^n\, p = \alpha_{u,v}^n\, p + (1 - \alpha_{u,v}^n)\lambda_{u,v}. \tag{3.41}$$

If $\alpha_{u,v}$ is not unity, the asymptotic result is, of course, $\lambda_{u,v}$ as defined above.
It is worth noting the behavior of $\lambda_{u,v}$ when $u$ or $v$ becomes large. First,
if $u$ is fixed and $v$ tends to infinity, we have

$$\lim_{v \to \infty} \lambda_{u,v} = \lambda_2, \qquad \alpha_2 \neq 1. \tag{3.42}$$

On the other hand, if $v$ is fixed and $u$ tends to infinity, we get

$$\lim_{u \to \infty} \lambda_{u,v} = \alpha_2^v \lambda_1 + (1 - \alpha_2^v)\lambda_2, \qquad \alpha_1 \neq 1. \tag{3.43}$$

The asymmetry in this pair of results comes from the fact that $Q_2$ is
applied after $Q_1$ in each block, and that the more recent applications
usually have relatively greater effects on the final outcome.

In Table 3.1 we have tabulated the asymptote for an example where
$u = 1$, and $v$ takes on values $0, 1, 2, \ldots, 9, \infty$, with $\lambda_1 = 0.75$, $\lambda_2 = 0.10$,
$\alpha_1 = 0.6$, $\alpha_2 = 0.9$. Clearly the asymptote starts at 0.75, the asymptote
of $Q_1^n p$, and gradually decreases to the asymptote of $Q_2^n p$ as the number
of $Q_2$ operations occurring after each $Q_1$ operation increases.



TABLE 3.1

Asymptotic values of $(Q_2^v Q_1)^n p$ for
$\lambda_1 = 0.75$, $\lambda_2 = 0.10$, $\alpha_1 = 0.6$, and $\alpha_2 = 0.9$.

    v      Asymptote

    0        0.75
    1        0.61
    2        0.51
    3        0.44
    4        0.38
    5        0.34
    6        0.30
    7        0.27
    8        0.25
    9        0.23
    ∞        0.10


Using the approach outlined above it is possible to examine the effects 
of any sequence of operators that is repeated over and over again, and 
generalization of these results to more than two events is straightforward. 
When the combining classes restriction of Section 1.8 is employed, a 
further generalization to more than two classes of responses is also 
straightforward though somewhat tedious. 
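The asymptotes of Table 3.1 follow from the expression for $\lambda_{u,v}$ in equation 3.40 and can be recomputed directly. The sketch below is illustrative; the function names are hypothetical, and the parameter values are those stated for Table 3.1.

```python
def block_parameters(u, v, alpha1, lam1, alpha2, lam2):
    """Return (alpha_uv, lambda_uv) for one block of u events E1 followed by v events E2.

    Assumes alpha1**u * alpha2**v is not unity, as required for the asymptote to exist.
    """
    alpha_uv = alpha1**u * alpha2**v
    lam_uv = (alpha2**v * (1 - alpha1**u) * lam1 + (1 - alpha2**v) * lam2) / (1 - alpha_uv)
    return alpha_uv, lam_uv


def run_blocks(p, u, v, alpha1, lam1, alpha2, lam2, n):
    """Apply Q1 u times and then Q2 v times, and repeat the whole block n times."""
    for _ in range(n):
        for _ in range(u):
            p = alpha1 * p + (1 - alpha1) * lam1
        for _ in range(v):
            p = alpha2 * p + (1 - alpha2) * lam2
    return p


ALPHA1, LAM1, ALPHA2, LAM2 = 0.6, 0.75, 0.9, 0.10

# Asymptotes for u = 1 and v = 0, 1, ..., 9.
asymptotes = [block_parameters(1, v, ALPHA1, LAM1, ALPHA2, LAM2)[1] for v in range(10)]

# Direct iteration of many blocks for v = 2, starting from an arbitrary p.
after_many_blocks = run_blocks(0.5, 1, 2, ALPHA1, LAM1, ALPHA2, LAM2, 200)
```

Rounded to two decimals, the computed asymptotes agree with the tabulated values, and direct iteration of the blocks converges to the same number for a given $v$.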


3.9 EXPERIMENTER-CONTROLLED EVENTS 


Sequence 1 in Section 3.2 is an example of a sequence of two events,
A and B, which occur in random order but with a fixed proportion of A's
and B's. Such a "random ratio" sequence of rewards and non-rewards
might be used in an actual experiment. In those experiments we would
consider reward of a particular response, when it occurred, to be one
event ($E_1$) and non-reward of that response, when it occurred, another
event ($E_2$). If we further assumed that no other events influenced the
animals' behavior we could apply the mathematical machinery of this
section to describe such experiments.

Another illustration is the Brunswik T-maze experiment [12]. In the
Introduction we described a special case of that experiment in which a
rat always was rewarded for turning right and never rewarded for turning
left. We assumed that a rewarded right turn and an unrewarded left
turn had the same effect on the probability p of a right turn on the next
trial. Hence we implicitly assumed that a rewarded right turn and an
unrewarded left turn constituted a single event class $E_1$ with associated
operator $Q_1$. Now in the Brunswik experiment, turning right was
rewarded a fixed proportion, $\pi_1$, of trials when the rat went right, and
turning left was rewarded some other fixed proportion, $\pi_2$, of trials when
the animal went left. Suppose we make an assumption similar to the
one described above: Assume that an unrewarded right turn and a
rewarded left turn have the same effect on p and so constitute an event
class $E_2$ with operator $Q_2$. (In Section 3.13 we present a more general
analysis which does not require these strong assumptions, which may or
may not be appropriate. The position here is that, if one cared to make
these assumptions, the analysis of this section would be applicable.)
Furthermore, let $\pi_1 + \pi_2 = 1$; this is not true in all Brunswik-type
experiments, but it has often been true. For example, with one group
of rats Brunswik chose $\pi_1 = 0.75$ and $\pi_2 = 0.25$. Now let us see what the
probability of event $E_1$ is under the above assumptions. The probability
of a rewarded right turn is $p\pi_1$ and the probability of an unrewarded left
turn is $q(1 - \pi_2) = (1 - p)\pi_1$. Hence the probability of $E_1$ is
$p\pi_1 + (1 - p)\pi_1 = \pi_1$. Similarly, it is easy to show that the probability of
$E_2$ is $\pi_2 = 1 - \pi_1$. The four possible response-outcome pairs are
summarized below:

    Response      Outcome       Operator    Probability of Occurrence

    right turn    reward        $Q_1$       $p\pi_1$
    right turn    non-reward    $Q_2$       $p(1 - \pi_1) = p\pi_2$
    left turn     reward        $Q_2$       $q\pi_2 = (1 - p)\pi_2$
    left turn     non-reward    $Q_1$       $q(1 - \pi_2) = (1 - p)\pi_1$
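
The bookkeeping in this table can be sketched in a few lines of code. This is an illustrative check only; the function name and the dictionary layout are assumptions of the sketch, and it relies on the restriction pi_1 + pi_2 = 1 made in the text. It shows that the probabilities of the two event classes do not depend on p.

```python
def event_probabilities(p, pi1):
    """Return the probabilities of event classes E1 and E2 on one trial.

    p   : probability of a right turn
    pi1 : proportion of right turns rewarded; pi2 = 1 - pi1 is assumed
    """
    pi2 = 1 - pi1
    q = 1 - p
    # (response, outcome, event class, probability of occurrence)
    pairs = [
        ("right", "reward", "E1", p * pi1),
        ("right", "non-reward", "E2", p * pi2),
        ("left", "reward", "E2", q * pi2),
        ("left", "non-reward", "E1", q * pi1),
    ]
    totals = {"E1": 0.0, "E2": 0.0}
    for _response, _outcome, event, prob in pairs:
        totals[event] += prob
    return totals


# Brunswik's value pi1 = 0.75; the totals are the same for every p.
for p in (0.1, 0.5, 0.9):
    print(p, event_probabilities(p, 0.75))
```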


An experimenter may actually go farther than we have indicated. He
may decide on $\pi_1$ of the trials to place food in the right box and no food
in the left box (event $E_1$) and on $(1 - \pi_1)$ of the trials to place food on
the left and no food on the right (event $E_2$). Thus on no trials will food
be placed on both sides, and on no trials will food be absent from both
sides.

In these illustrations the probabilities $\pi_1$ and $\pi_2$ are fixed by the
experimenter. Hence we call this case the case of experimenter-controlled
events. The mathematical problem is to compute the effect of applying
operator $Q_1$ with fixed probability $\pi_1$ and operator $Q_2$ with fixed
probability $\pi_2 = 1 - \pi_1$. This is the task of the present section.

Consider two operators $Q_1$ and $Q_2$ for two events; suppose that they
are applied successively with fixed probabilities $\pi_1$ and $\pi_2$, respectively,
where $\pi_1 + \pi_2 = 1$. If this is done it no longer makes sense to speak of
an organism's probability p of making response $A_1$ at the end of n events,
unless by this we mean that we are told the actual sequence of events that
did occur after the chance sequence had been determined. For example,
in the T-maze example described above, one animal might obtain five
occurrences of event $E_1$ in five trials and another might obtain only two
such occurrences, depending upon the precise sequence of events for each
animal. At the end of five trials there are $2^5 = 32$ possible values of p
that an organism might have for a fixed initial value. So rather than ask
what the value of p is at the end of a number of trials, we ordinarily
would inquire about the average probability $\bar{p}$ of the response for a large
number of organisms. This is not the result for an average organism.


There usually is no average organism in the sense that that organism
would ever assume the mean probability except for $\pi_1 = 1$ or 0.

[Figure 3.2: tree diagram. Columns give application numbers 0 through 3;
each node shows the p value of a group with its proportion beneath it.]

Fig. 3.2. The successive splits that a large group of animals goes through after three
events when the probability of applying operator $Q_1$ is $\pi_1$, given an operator is to be
applied (experimenter-controlled events). The proportions in the groups written
beneath the p value of the group have been written to parallel the Q's rather than in
the simplest form; thus under $Q_2 Q_1 Q_2 p$ is written $\pi_2 \pi_1 \pi_2$ rather than $\pi_1 \pi_2^2$.

Let us start with an initial value p. The proportion of animals with $Q_1$
applied first is $\pi_1$, whereas that with $Q_2$ applied is $\pi_2$. Thus there are
now two groups of animals that have been treated alike up to this point,
the $Q_1$ group and the $Q_2$ group, and their sizes are the proportions $\pi_1$
and $\pi_2$, respectively. After the next event there are usually four groups.
The proportion in the $Q_1^2$ group is $\pi_1^2$, that for the $Q_1 Q_2$ group is $\pi_1 \pi_2$,
that for the $Q_2 Q_1$ group is $\pi_1 \pi_2$, and that for the $Q_2^2$ group is $\pi_2^2$. The
tree diagram (Fig. 3.2) shows the sequence of events together with their
probabilities for the first three applications of the operators.

From an examination of the tree diagram it should be clear that the
proportion of animals in a given group is known as soon as the number
of times each operator is applied is known. In particular, if $Q_1$ has been
applied u times, and $Q_2$ has been applied v times, the proportion in the
group is $\pi_1^u \pi_2^v$; this proportion applies to every rearrangement of the
operators with u applications of $Q_1$ and v of $Q_2$.
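
The splitting into groups can be enumerated directly. The sketch below is illustrative and not from the original text: it assumes the slope-intercept form Q_i p = a_i + alpha_i p, and the parameter values in `OPS` and `PIS` are arbitrary choices made for the example.

```python
from itertools import product


def enumerate_groups(p0, n, ops, pis):
    """List every group after n events as (event sequence, p value, proportion).

    ops : list of (a_i, alpha_i) pairs, slope-intercept form Q_i p = a_i + alpha_i * p
    pis : list of fixed probabilities pi_i of applying each operator
    """
    groups = []
    for seq in product(range(len(ops)), repeat=n):
        p, proportion = p0, 1.0
        for i in seq:  # events are applied in the order listed
            a, alpha = ops[i]
            p = a + alpha * p
            proportion *= pis[i]
        groups.append((seq, p, proportion))
    return groups


# Hypothetical parameter values, chosen only for illustration.
OPS = [(0.3, 0.6), (0.01, 0.9)]   # (a_1, alpha_1), (a_2, alpha_2)
PIS = [0.75, 0.25]                # pi_1, pi_2

for seq, p, proportion in enumerate_groups(0.5, 3, OPS, PIS):
    print(seq, round(p, 4), round(proportion, 4))
```

Each proportion printed depends only on how many times each operator appears in the sequence, while the p value generally depends on the order as well.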

We are interested in the average fraction of organisms that will make
response $A_1$ after n applications of operators. The situation after n
applications is that there are $2^n$ groups at most. The groups are
homogeneous in the sense that all animals in a group have the same probability
of response $A_1$, but different groups have in general different probabilities.
If we think of the groups as ordered after a particular number of
applications of the operators, we can call $p_\nu$ the probability of response $A_1$
in the $\nu$th group and $P_\nu$ the proportion that the group is of the total
population. The average contribution of the $\nu$th group to the total that
would make response $A_1$ is just the product $p_\nu P_\nu$; and therefore the
average fraction $\bar{p}_n$ for the whole population is the sum of such products
over all the groups in the population, that is,

$$ \bar{p}_n = \sum_{\nu} p_\nu P_\nu. \tag{3.44} $$

This result is just the mean of the distribution of $p_\nu$.

Let us calculate some means after successive applications of the
operators, using the slope-intercept form of the operators,

$$ \bar{p}_0 = p \times 1 = p, \tag{3.45} $$

$$ \bar{p}_1 = \pi_1 Q_1 p + \pi_2 Q_2 p
   = \pi_1 (a_1 + \alpha_1 p) + \pi_2 (a_2 + \alpha_2 p)
   = (\pi_1 a_1 + \pi_2 a_2) + (\pi_1 \alpha_1 + \pi_2 \alpha_2)\, p. \tag{3.46} $$

The fact that $\bar{p}_1$ can be written as a linear function of p suggests the
slope-intercept form:

$$ \bar{p}_1 = \bar{a} + \bar{\alpha}\, p, \tag{3.47} $$

where

$$ \bar{a} = \pi_1 a_1 + \pi_2 a_2, \qquad \bar{\alpha} = \pi_1 \alpha_1 + \pi_2 \alpha_2. \tag{3.48} $$

Note that $\bar{a}$ and $\bar{\alpha}$ are the averages of the a's and $\alpha$'s of the operators
when weighted by their frequencies of occurrence. This way of writing
$\bar{p}_1$ suggests in turn the conjecture:

$$ \bar{p}_n = \bar{\alpha}^n p + (1 - \bar{\alpha}^n)\, \bar{\lambda}, \tag{3.49} $$

where

$$ \bar{\lambda} = \frac{\bar{a}}{1 - \bar{\alpha}}. \tag{3.50} $$

To verify this conjecture let us call the probability of the $\nu$th group on
the nth trial $p_{\nu n}$, and the proportion of the population in the $\nu$th group
$P_{\nu n}$. On the (n + 1)st application of the operators this group will split
into two groups, with new values of p and proportions as follows:

    New Values of p                                  New Proportions

    $Q_1 p_{\nu n} = a_1 + \alpha_1 p_{\nu n}$       $\pi_1 P_{\nu n}$
    $Q_2 p_{\nu n} = a_2 + \alpha_2 p_{\nu n}$       $\pi_2 P_{\nu n}$

To get $\bar{p}_{n+1}$ we must multiply the new values of p by the new proportions
and then sum over all the groups:

$$ \bar{p}_{n+1} = \sum_{\nu=1}^{2^n} \left[ \pi_1 P_{\nu n} (a_1 + \alpha_1 p_{\nu n}) + \pi_2 P_{\nu n} (a_2 + \alpha_2 p_{\nu n}) \right]
   = (\pi_1 a_1 + \pi_2 a_2) \sum_{\nu=1}^{2^n} P_{\nu n} + (\pi_1 \alpha_1 + \pi_2 \alpha_2) \sum_{\nu=1}^{2^n} p_{\nu n} P_{\nu n}. \tag{3.51} $$

Naturally the sum of $P_{\nu n}$ over $\nu$ is unity, because it represents the total size
of the group, and by definition

$$ \sum_{\nu} p_{\nu n} P_{\nu n} = \bar{p}_n, \tag{3.52} $$

leading to the difference equation

$$ \bar{p}_{n+1} = \pi_1 a_1 + \pi_2 a_2 + (\pi_1 \alpha_1 + \pi_2 \alpha_2)\, \bar{p}_n = \bar{a} + \bar{\alpha}\, \bar{p}_n. \tag{3.53} $$

We already know the solution to this difference equation, because it was
obtained in Section 3.3. The solution is given by the conjecture 3.49,
and so we have verified the conjecture.

These results for fixed probabilities of applying the operators show that
for the purpose of obtaining means we could define an expected operator
$\bar{Q}$ by

$$ \bar{Q} p = \bar{a} + \bar{\alpha}\, p \tag{3.54} $$

that behaves just like the Q's (cf. Section 6.4).
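
As a numerical cross-check on equations 3.49 and 3.53, the mean can be computed two ways: by summing over every group in the tree and by the closed form. This is a sketch under stated assumptions, not the authors' procedure; the function names and the parameter values are hypothetical.

```python
from itertools import product


def mean_by_enumeration(p0, n, ops, pis):
    """Mean p value after n events: sum of (p value * proportion) over all groups."""
    total = 0.0
    for seq in product(range(len(ops)), repeat=n):
        p, proportion = p0, 1.0
        for i in seq:
            a, alpha = ops[i]
            p = a + alpha * p          # slope-intercept form Q_i p = a_i + alpha_i * p
            proportion *= pis[i]
        total += p * proportion
    return total


def mean_closed_form(p0, n, ops, pis):
    """Closed form: alpha_bar**n * p0 + (1 - alpha_bar**n) * lambda_bar."""
    a_bar = sum(pi * a for pi, (a, _alpha) in zip(pis, ops))
    alpha_bar = sum(pi * alpha for pi, (_a, alpha) in zip(pis, ops))
    lambda_bar = a_bar / (1 - alpha_bar)
    return alpha_bar ** n * p0 + (1 - alpha_bar ** n) * lambda_bar


# Hypothetical parameter values, chosen only for illustration.
OPS = [(0.3, 0.6), (0.01, 0.9)]
PIS = [0.75, 0.25]

for n in range(6):
    print(n, mean_by_enumeration(0.5, n, OPS, PIS), mean_closed_form(0.5, n, OPS, PIS))
```

The enumeration grows as 2^n and is useful only as a check; the closed form is what one would use in practice. Because both functions take lists, the same sketch also accepts more than two operators.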


3.10 EXPERIMENTER-CONTROLLED EVENTS WITH 
t OPERATORS AND r ALTERNATIVES 


The preceding results can be extended to any number, t, of events
immediately. We merely define

$$ \bar{a} = \sum_{i=1}^{t} \pi_i a_i, \qquad \bar{\alpha} = \sum_{i=1}^{t} \pi_i \alpha_i. \tag{3.55} $$

Equation 3.53 then gives the correct mean probabilities.

The analysis can also be extended to any number of response classes.
A matrix operator corresponds to each event, and these operators generate
distributions of probability vectors. We give the detailed discussion of
this extension in Section 4.4.


3.11 SIMPLE MARKOV CHAINS 

In Section 3.2 of this chapter we discussed three kinds of elementary
sequences of events. One of these, sequence 3, was completely systematic:
two A's and a B, two A's and a B, etc. In Section 3.8 we analyzed the
effects of such sequences of operators $Q_1$ and $Q_2$. Sequence 1 of Section
3.2, on the other hand, involved a random order of A's and B's but with
a fixed proportion of each. In the preceding two sections we discussed
such random sequences of operators. There remains, then, a need to
discuss applications of the operators with dependent probabilities such as
those illustrated by sequence 2 of Section 3.2. In that example, the
probability of a B following an A was 0.5, whereas the probability of a
B following a B was zero. We pointed out that this type of sequence was
the simplest type of Markov chain, and we want to expand this notion in
this section. The application of Markov chain theory to psychology
has been discussed by Miller [4].

In any Markov chain involving two events, A and B, we must know
three quantities: an initial probability and two transition probabilities.
The initial probability of event A is denoted by $p_0(A)$, and the initial
probability of event B is $p_0(B)$. One of these events necessarily starts
the sequence, and so we have

$$ p_0(A) + p_0(B) = 1. \tag{3.56} $$

As soon as the first event occurs, the next event is chosen according to the
two conditional probabilities. If event A has just occurred, the
probability of an A is $p(A|A)$ and the probability of a B is $p(B|A)$. (The
notation $p(x|y)$ is read "the probability of x, given that y has occurred.")
If event B had occurred, the probability of an A is $p(A|B)$ and the
probability of a B is $p(B|B)$. We must have, of course,

$$ p(A|A) + p(B|A) = 1, \tag{3.57} $$

$$ p(A|B) + p(B|B) = 1. \tag{3.58} $$

A total of six probabilities has been introduced, but the preceding three
equations reduce the number of independent variables to three.

If we know the three numbers $p_0(A)$, $p(A|A)$ and $p(A|B)$ we completely
specify the Markov process. If we know what event occurs at any point
in the process, we know the probability of an A occurring next. However,
we ask another question: What is the probability $p_n(A)$ on trial n if we do
not know the events on earlier trials? This question acquires more
meaning if we think of the population of all possible sequences that are
generated by the specified rules. The fraction of this population that
begins with event A is $p_0(A)$; of these sequences, a fraction $p(A|A)$ will
have an event A in the second position (trial n = 1). The fraction of the
total population of sequences which begin with event B is $p_0(B)$, whereas
the fraction of these that have an A on trial n = 1 is $p(A|B)$. Thus the
total fraction of sequences with an A on trial n = 1 is

$$ p_1(A) = p_0(A)\, p(A|A) + p_0(B)\, p(A|B). \tag{3.59} $$

For example, if $p_0(A) = 0.2$, $p(A|A) = 0.7$, and $p(A|B) = 0.5$, then

$$ p_1(A) = (0.2)(0.7) + (0.8)(0.5) = 0.54. $$


Therefore, we see that the unconditional probability of an A on trial n
may be interpreted as the fraction of the total population of sequences
that have an A on trial n. For example, on trial n = 2 we have

$$ p_2(A) = p_1(A)\, p(A|A) + p_1(B)\, p(A|B). \tag{3.60} $$

We have already found that $p_1(A)$ for our numerical example was 0.54.
Thus, for that example,

$$ p_2(A) = (0.54)(0.7) + (0.46)(0.5) = 0.608. $$

More generally, on trial n + 1,

$$ p_{n+1}(A) = p_n(A)\, p(A|A) + p_n(B)\, p(A|B)
   = p_n(A)\, p(A|A) + [1 - p_n(A)]\, p(A|B)
   = p(A|B) + [p(A|A) - p(A|B)]\, p_n(A). \tag{3.61} $$
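
The recursion 3.61 is easy to iterate by machine. The following sketch is illustrative only (the function names are not from the original text); it reproduces the worked values 0.54 and 0.608 of the numerical example.

```python
def next_prob_a(p_n_a, p_a_given_a, p_a_given_b):
    """One step of the recursion: p_{n+1}(A) = p_n(A) p(A|A) + [1 - p_n(A)] p(A|B)."""
    return p_n_a * p_a_given_a + (1 - p_n_a) * p_a_given_b


def prob_a_by_trial(p0_a, p_a_given_a, p_a_given_b, n_trials):
    """Return the list [p_0(A), p_1(A), ..., p_n(A)]."""
    probs = [p0_a]
    for _ in range(n_trials):
        probs.append(next_prob_a(probs[-1], p_a_given_a, p_a_given_b))
    return probs


# Numerical example from the text: p_0(A) = 0.2, p(A|A) = 0.7, p(A|B) = 0.5.
print([round(x, 4) for x in prob_a_by_trial(0.2, 0.7, 0.5, 5)])
```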




Hence, we obtain a linear difference equation in the $p_n(A)$. If we let

$$ p(A|B) = a, \qquad p(A|A) - p(A|B) = \alpha, \tag{3.62} $$

we have

$$ p_{n+1}(A) = a + \alpha\, p_n(A). \tag{3.63} $$

This linear equation is an example of difference equation 3.8 encountered
in Section 3.3. The solution is

$$ p_n(A) = \alpha^n p_0(A) + (1 - \alpha^n)\, p_\infty(A), \tag{3.64} $$

where

$$ p_\infty(A) = \frac{a}{1 - \alpha} = \frac{p(A|B)}{1 - p(A|A) + p(A|B)} = \frac{p(A|B)}{p(B|A) + p(A|B)}. \tag{3.65} $$

This result is quite well known [4]. We see that, if $-1 < \alpha < +1$, then
as $n \to \infty$, the quantity $\alpha^n$ tends to zero, and so $p_n(A)$ tends to $p_\infty(A)$.
In the example with $p_0(A) = 0.2$, $p(A|A) = 0.7$, $p(A|B) = 0.5$,

$$ p_\infty(A) = \frac{0.5}{0.3 + 0.5} = \frac{5}{8} = 0.625. $$

We note that $p_2(A) = 0.608$, so that the limit is very closely approximated
in this case after a very few trials. The speed of convergence, of course,
depends on the magnitude of $p(A|A) - p(A|B)$, and in the present case
this is 0.2, and powers of 0.2 tend to zero rapidly.
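
The closed-form solution and its limit can be evaluated directly. This is a minimal sketch with illustrative function names, using the same numerical example as the text.

```python
def asymptote(p_a_given_a, p_a_given_b):
    """p_inf(A) = p(A|B) / (p(B|A) + p(A|B)), with p(B|A) = 1 - p(A|A)."""
    return p_a_given_b / ((1 - p_a_given_a) + p_a_given_b)


def closed_form(p0_a, p_a_given_a, p_a_given_b, n):
    """p_n(A) = alpha**n * p_0(A) + (1 - alpha**n) * p_inf(A), alpha = p(A|A) - p(A|B)."""
    alpha = p_a_given_a - p_a_given_b
    p_inf = asymptote(p_a_given_a, p_a_given_b)
    return alpha ** n * p0_a + (1 - alpha ** n) * p_inf


# p_0(A) = 0.2, p(A|A) = 0.7, p(A|B) = 0.5: the gap to the limit shrinks by 0.2 per trial.
for n in range(6):
    print(n, round(closed_form(0.2, 0.7, 0.5, n), 6))
print("limit:", asymptote(0.7, 0.5))
```

Starting the chain at the limit, that is, passing `asymptote(0.7, 0.5)` as the initial probability, returns the same value on every trial.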

To return to the general problem, suppose that the starting proportions
are the same as the asymptotic proportions, that is,

$$ p_0(A) = p_\infty(A) = \frac{p(A|B)}{p(B|A) + p(A|B)}. \tag{3.66} $$

If we substitute this in equation 3.64 we see that

$$ p_n(A) = p_\infty(A). \tag{3.67} $$

Thus, when the starting condition $p_0(A)$ is adjusted properly, the sequences
have the same proportion of A events on every trial. The reason for the
trial-to-trial change in the more general case is that the starting proportions
differ from the asymptotic proportions, and this discrepancy can be
adjusted only gradually.




Newman found a simple example of a Markov chain in the written
Samoan language [13]. If a consonant is called a B event, and a vowel an
A event, the sequence of vowels and consonants in Samoan can be
represented by a Markov chain with $p(B|B) = 0$ and $p(B|A) = 0.49$. Thus
consonants never follow consonants in written Samoan. By a happy
accident, sequence 2 of Section 3.2 provides a close approximation to a
sequence of vowels and consonants in Samoan writing. There $p(B|B) = 0$
and $p(B|A) = 0.5$. The asymptotic proportion of vowels in Samoan
writing would, according to equation 3.65, have the value 1/(0.49 + 1) =
0.671, or about 2/3.
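
A chain of this kind can also be simulated. The sketch below is illustrative and not part of the original text: the function name, the seed, and the sequence length are arbitrary assumptions. It generates a long sequence with p(B|B) = 0 and p(B|A) = 0.49 and compares the observed fraction of A events with 1/(0.49 + 1).

```python
import random


def simulate_chain(n, p_b_given_a, p_b_given_b, first="A", seed=0):
    """Generate n events of a two-event Markov chain as a string of 'A' and 'B'."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    events = [first]
    for _ in range(n - 1):
        p_b = p_b_given_a if events[-1] == "A" else p_b_given_b
        events.append("B" if rng.random() < p_b else "A")
    return "".join(events)


sequence = simulate_chain(100_000, p_b_given_a=0.49, p_b_given_b=0.0)
print(sequence[:40])
print("fraction of A events:", sequence.count("A") / len(sequence))
print("asymptotic value:    ", 1 / (0.49 + 1))
```

Because p(B|B) is zero, the string never contains two consecutive B's, whatever the seed.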

Simple Markov chains of the type just described are seldom used for 
constructing schedules of reinforcements, but were used, for example, 
by Hake and Hyman in a prediction experiment [14]. We do not analyze 
a Markov sequence of the operators $Q_1$ and $Q_2$ in this chapter. However,
a more common experimental procedure leads to a considerably more 
complicated Markov chain which is the subject of the next section. 


3.12 SUBJECT-CONTROLLED EVENTS 


We now consider a problem that is of major importance to most of the
remainder of this book. Suppose that we equate a response occurrence
with an event in the mathematical system. For two alternatives $A_1$ and
$A_2$ we simply let event $E_1$ be an occurrence of response $A_1$ and also let
event $E_2$ be an occurrence of response $A_2$. We suggest two possible
rationales for making these equivalences. First, the Guthrian school
of association theorists postulates that conditioning is a consequence of the
contiguity between stimulus and response. From this position, we
conclude that the mere occurrence of a response and the stimulus change
it produces alter future behavior, and thus, in our language, constitute
an event. The second rationale arises from the common experimental
procedure of making a reward contingent upon the occurrence of a
particular response. The simple T-maze experiment described in the
Introduction is an example. If the rat turns right, reward is always
found; if the rat turns left, reward is never presented. The two responses
are turning right ($A_1$) and turning left ($A_2$). Event $E_1$ in the mathematical
system is identified with turning right and finding reward, and event $E_2$
is identified with turning left and finding no reward. Note that the event
includes the response and the outcome (reward or no reward), but the
outcome of a trial is completely determined by the response which is
made in this experiment. Hence we can say that $E_1$ occurs whenever $A_1$
occurs and that $E_2$ occurs whenever $A_2$ occurs. For this reason we refer
to this case as the case of subject-controlled events.

When the foregoing equivalence between responses and events is
assumed, the probability of event $E_1$ on trial n is no longer constant; it
is equal to the probability $p_n$ of response $A_1$. Moreover, the conditional
probability of event $E_1$ on trial n + 1, given that $E_1$ has occurred on trial
n, is not even constant; it is equal to $Q_1 p_n$, and this probability will
ordinarily depend upon n. Similarly, the conditional probability of
event $E_1$ on trial n + 1, given that $E_2$ occurred on trial n, is $Q_2 p_n$. Hence
the simple type of Markov chain described in the last section is not
appropriate for the present problem; the conditional probabilities are
not constant. We can, however, make the conditional probabilities
constant by choosing the operators $Q_1$ and $Q_2$ in a very special way. If
we let $\alpha_1 = \alpha_2 = 0$, we have $Q_1 p = \lambda_1$ and $Q_2 p = \lambda_2$, that is, the
probability of an $A_1$ following an $A_1$ is $\lambda_1$, and the probability of an $A_1$ following
an $A_2$ is $\lambda_2$. For this special choice of the operators we do have a simple
two-state Markov chain of the type described in the last section. The
process is completely defined by the two conditional probabilities, $\lambda_1$
and $\lambda_2$, and the initial probability, $p_0$, of an $A_1$ response. In this sense,
then, the simple two-state Markov chain is a special case of our general
mathematical system. Except for this special choice of the parameters
of the operators, a two-state Markov chain is inadequate for our present
problem, but we now show that a more general Markov process is
appropriate.

In the theory of Markov chains, we talk about a number of states and
the probabilities of the system being in these states. In Section 3.11 we
let these states be events A and B and so we had but two states. The
Markov chains discussed in this book have transition probabilities that
are the same on every trial. (More general Markov chains having
time-dependent transition probabilities are not considered, though this
complication might be an asset to a learning model. Feller discusses these
more general Markov chains [7].) The number of states may be greater
than two and, in fact, may be infinite. The learning process referred to
in the preceding paragraph may be considered a Markov process, provided
we let the possible values of the response probability p represent the states
of the system. We shall try to make this notion more explicit. Ordinarily
we like to number the states or give them letters, A, B, C, .... But we
shall label a particular state by specifying a value of p, the probability of
occurrence of response $A_1$ (or of event $E_1$). Since p can usually assume
any value between zero and unity, an infinite number of states is possible.
On any trial, however, only two states can be reached from the state p,
namely, $Q_1 p$ and $Q_2 p$. The transition probabilities to all other states are
zero. The Markov condition that transition probabilities remain constant
requires only that those transition probabilities be completely determined
by the existence of the system in any given state. In our problem, the
conditional probability of state $Q_1 p$ given state p is simply p, the
conditional probability of state $Q_2 p$ given state p is 1 - p, and the conditional
probability of all other states is zero.

The characteristic feature of a Markov process is the independence of
path. By this is meant that the transition probabilities from a particular
state do not depend upon how that particular state was reached. The
mathematical process described above is Markovian because the
conditional probabilities of all other states, given state p, do not depend upon
how state p was achieved. This may be said in another way: Given that
the probability of response $A_1$ is $p_n$ on trial n, then the probabilities of
the various possible values of $p_{n+1}$ do not depend in any way upon
$p_{n-1}, p_{n-2}, \ldots, p_0$.
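
The transition rule just described can be sketched as a map from one distribution of p values to the next. This is an illustrative sketch only: the function names and the operator parameters in `OPS` are hypothetical, and the slope-intercept form Q_i p = a_i + alpha_i p is assumed. Each state p sends a share p of its proportion to the state Q_1 p and a share 1 - p to the state Q_2 p, and nothing about the earlier history enters.

```python
def step_distribution(dist, ops):
    """Advance a distribution of p values by one subject-controlled trial.

    dist : dict mapping a p value (state) to the proportion of organisms in it
    ops  : ((a1, alpha1), (a2, alpha2)), slope-intercept form of Q1 and Q2
    """
    (a1, alpha1), (a2, alpha2) = ops
    new_dist = {}
    for p, proportion in dist.items():
        # From state p only two states can be reached: Q1 p and Q2 p.
        for new_p, transition_prob in ((a1 + alpha1 * p, p), (a2 + alpha2 * p, 1 - p)):
            new_dist[new_p] = new_dist.get(new_p, 0.0) + proportion * transition_prob
    return new_dist


def distribution_after(p0, n, ops):
    """Distribution of p values after n trials, starting with everyone at p0."""
    dist = {p0: 1.0}
    for _ in range(n):
        dist = step_distribution(dist, ops)
    return dist


# Hypothetical operators: Q1 p = 0.5 + 0.5 p and Q2 p = 0.8 p.
OPS = ((0.5, 0.5), (0.0, 0.8))

for p, proportion in sorted(distribution_after(0.5, 3, OPS).items()):
    print(round(p, 4), round(proportion, 4))
```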

Since the probability of applying $Q_1$ to $p_n$ to obtain $p_{n+1}$ is $p_n$ and the
probability of applying $Q_2$ to $p_n$ is $1 - p_n$, we have a branching process
similar to the one described in Section 3.9. As before, we give a diagram
showing the results of the first three applications of the operators (Fig.
3.3). As in previous sections we are interested in the average p value at
the end of n applications of the operators. Let us compute $\bar{p}_0$ and $\bar{p}_1$:

$$ \bar{p}_0 = p, \tag{3.68} $$

$$ \bar{p}_1 = p\, Q_1 p + q\, Q_2 p = a_1 p + \alpha_1 p^2 + a_2 q + \alpha_2 p q. \tag{3.69} $$

And substituting q = 1 - p in this equation, we obtain

$$ \bar{p}_1 = a_2 + (a_1 - a_2 + \alpha_2)\, p + (\alpha_1 - \alpha_2)\, p^2. \tag{3.70} $$

This is the first time we have met a $\bar{p}$ that was expressed in any way other
than linearly in the initial value of p. It is a great blow because the nice
methods we have used no longer work. We note that when $\alpha_1 = \alpha_2$ the
$p^2$ term vanishes, and we do have a linear expression in p. This special
case will be handled in detail in Chapter 5. Here we consider only the
general case when $\alpha_1 \neq \alpha_2$. Suppose for a moment that we could use
the expression on the right of equation 3.70 to generate $\bar{p}_2$, $\bar{p}_3$, etc., just
as we did for fixed probabilities of application of the operators in Section
3.9. Then the conjectured equation would be

$$ \bar{p}_{n+1} = a_2 + (a_1 - a_2 + \alpha_2)\, \bar{p}_n + (\alpha_1 - \alpha_2)\, \bar{p}_n^2. \tag{3.71} $$

Furthermore,

$$ \bar{p}_{n+2} = a_2 + (a_1 - a_2 + \alpha_2)\, \bar{p}_{n+1} + (\alpha_1 - \alpha_2)\, \bar{p}_{n+1}^2
   = a_2 + (a_1 - a_2 + \alpha_2) \left[ a_2 + (a_1 - a_2 + \alpha_2)\, \bar{p}_n + (\alpha_1 - \alpha_2)\, \bar{p}_n^2 \right]
   + (\alpha_1 - \alpha_2) \left[ a_2 + (a_1 - a_2 + \alpha_2)\, \bar{p}_n + (\alpha_1 - \alpha_2)\, \bar{p}_n^2 \right]^2. \tag{3.72} $$


Without simplifying this expression further it is clear that $\bar{p}_2$ will have
a term involving the fourth power of the starting value of p, unless $\alpha_1 = \alpha_2$.
Now let us consider what happens when we actually try to compute $\bar{p}_2$
so as to compare with the above conjecture. From the tree diagram, Fig.
3.3, we have expected values to compute by multiplying the p values of
the groups by their probabilities and summing. We notice that the p
value in general involves only a linear expression in p for any group, but
the probability of occurrence contains as high a power of p as the number
of times the operator has been applied. Therefore the highest power of p
in $\bar{p}_2$ is clearly the highest power that can be obtained by multiplying a
linear expression in p by a quadratic in p; thus the highest possible power
is $p^3$. We have already shown that an attempt to use the conjectured
equation 3.71 must lead to terms in $p^4$, and therefore the conjecture does
not give the correct answer. This is unfortunate because it complicates
matters tremendously. Furthermore, the case under discussion is of
interest in many learning situations, so that we shall have to treat it more
extensively, and give particular study to a number of special cases as well
as investigate methods of handling the general case. Indeed, the later
chapters of Part I are mainly devoted to this investigation.

[Figure 3.3: tree diagram. Columns give application numbers 0 through 3;
each node shows the p value of a group and its proportion.]

Fig. 3.3. This diagram shows the successive splits that a large number of organisms
would go through on successive applications of the operators for the case of
subject-controlled events. Both the p value of the group and the proportional size of the
group are given at each stage of the operation. The probability of applying $Q_1$ is
equal to the p value of the group.
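
The failure of the conjecture can be seen numerically. The sketch below is illustrative, with hypothetical operator parameters and function names: it computes the exact mean by walking the tree of subject-controlled events and compares it with the value obtained by iterating the conjectured equation 3.71. The two agree after one trial and differ after two unless the two slopes are equal.

```python
def exact_mean(p, n, ops):
    """Exact mean p value after n subject-controlled trials, by walking the tree."""
    if n == 0:
        return p
    (a1, alpha1), (a2, alpha2) = ops
    # Q1 is applied with probability p, Q2 with probability 1 - p.
    return (p * exact_mean(a1 + alpha1 * p, n - 1, ops)
            + (1 - p) * exact_mean(a2 + alpha2 * p, n - 1, ops))


def conjectured_mean(p, n, ops):
    """Iterate the conjectured recursion on the mean (not correct in general)."""
    (a1, alpha1), (a2, alpha2) = ops
    mean = p
    for _ in range(n):
        mean = a2 + (a1 - a2 + alpha2) * mean + (alpha1 - alpha2) * mean ** 2
    return mean


# Hypothetical operators with unequal slopes: Q1 p = 0.5 + 0.5 p, Q2 p = 0.8 p.
UNEQUAL = ((0.5, 0.5), (0.0, 0.8))
# Hypothetical operators with equal slopes: Q1 p = 0.2 + 0.8 p, Q2 p = 0.8 p.
EQUAL = ((0.2, 0.8), (0.0, 0.8))

for n in (1, 2, 3):
    print(n, exact_mean(0.5, n, UNEQUAL), conjectured_mean(0.5, n, UNEQUAL))
```

The recursive tree walk visits 2^n leaves, so it is practical only for small n; it serves here as a reference against which any shortcut can be tested.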


3.13 EXPERIMENTER-SUBJECT-CONTROLLED EVENTS 


We have previously discussed two important cases. In Section 3.9 we
described the case of experimenter-controlled events, for which the event
probabilities are fixed. In Section 3.12 we introduced the case of
subject-controlled events, for which the event probabilities were identical to the
response probabilities. We now combine these ideas as follows. We
identify an event $E_{jk}$ with the occurrence of an alternative $A_j$ and an
occurrence of a particular outcome $O_k$. We consider two possible
alternatives and s possible outcomes (k = 1, 2, ..., s), and we define a
set of constant conditional probabilities $\pi_{jk}$, which specify the
conditional probability of outcome $O_k$, given that alternative $A_j$ has occurred.
The s outcomes are mutually exclusive and exhaustive and so

$$ \sum_{k=1}^{s} \pi_{jk} = 1, \qquad j = 1, 2. \tag{3.73} $$

Instead of the subscript i on the operator $Q_i$ we use the double subscript
jk; we have the operators $Q_{jk}$ defined by

$$ Q_{jk}\, p = \alpha_{jk}\, p + (1 - \alpha_{jk})\, \lambda_{jk}. \tag{3.74} $$

This operator is applied when alternative $A_j$ occurs and is followed by
outcome $O_k$. Hence the probability of application of $Q_{1k}$ is $p\pi_{1k}$ and the
probability of application of $Q_{2k}$ is $(1 - p)\pi_{2k}$. Since the conditional
probabilities $\pi_{jk}$ are determined by the experimenter and because p and
1 - p are the response probabilities of the subject, we call this case the
case of experimenter-subject-controlled events.

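Particularly in Chapter 13 will we be interested in two alternatives $A_1$
and $A_2$ and two outcomes $O_1$ and $O_2$. We then have four operators $Q_{jk}$
as given below:

(3.75)

    Alternative   Outcome   Operation                                                        Probability of Application

    $A_1$         $O_1$     $Q_{11} p = \alpha_{11} p + (1 - \alpha_{11}) \lambda_{11}$      $p\pi_{11}$
    $A_1$         $O_2$     $Q_{12} p = \alpha_{12} p + (1 - \alpha_{12}) \lambda_{12}$      $p\pi_{12}$
    $A_2$         $O_1$     $Q_{21} p = \alpha_{21} p + (1 - \alpha_{21}) \lambda_{21}$      $(1 - p)\pi_{21}$
    $A_2$         $O_2$     $Q_{22} p = \alpha_{22} p + (1 - \alpha_{22}) \lambda_{22}$      $(1 - p)\pi_{22}$

We have the relations

$$ \pi_{11} + \pi_{12} = 1, \qquad \pi_{21} + \pi_{22} = 1. \tag{3.76} $$

In Chapter 4 we consider the sequences of probability values generated by
such operators.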
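
One trial of the two-alternative, two-outcome case can be tabulated in code. This is a sketch under stated assumptions: the function name is illustrative and every numerical value of pi_jk, alpha_jk and lambda_jk below is hypothetical, chosen only so that the example runs.

```python
def one_trial_branches(p, pi, alpha, lam):
    """For each (alternative j, outcome k), return (new p value, probability of application).

    pi, alpha, lam : dicts keyed by (j, k) with j in {1, 2}
    The operator is Q_jk p = alpha_jk * p + (1 - alpha_jk) * lambda_jk.
    """
    branches = {}
    for (j, k), pi_jk in pi.items():
        response_prob = p if j == 1 else 1 - p   # subject's part
        new_p = alpha[(j, k)] * p + (1 - alpha[(j, k)]) * lam[(j, k)]
        branches[(j, k)] = (new_p, response_prob * pi_jk)  # times experimenter's part
    return branches


# Hypothetical parameter values for illustration only.
PI = {(1, 1): 0.75, (1, 2): 0.25, (2, 1): 0.25, (2, 2): 0.75}
ALPHA = {(1, 1): 0.8, (1, 2): 0.9, (2, 1): 0.9, (2, 2): 0.8}
LAM = {(1, 1): 1.0, (1, 2): 0.0, (2, 1): 0.0, (2, 2): 1.0}

for key, (new_p, prob) in one_trial_branches(0.4, PI, ALPHA, LAM).items():
    print(key, round(new_p, 4), round(prob, 4))
```

The four probabilities of application sum to unity for any p, because each row of the conditional probabilities sums to unity.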


3.14 SUMMARY 

This chapter describes various possible sequences of events and the
resulting sequences of operators. Three types of simple sequences, one
with independent random elements, one with elements dependent on the
preceding entry, and a completely systematic sequence, are introduced in
Section 3.2. These are later used as prototypes for operator sequences.
In Section 3.3 the effects of a repetitive application of a single operator
$Q_i$ are analyzed; the following formula is developed for computing the
response probability on every trial:

$$ Q_i^n\, p = \alpha_i^n\, p + (1 - \alpha_i^n)\, \lambda_i. \tag{3.5} $$

The analogous problem for matrix operators $T_i$ is solved in Section 3.5.

The commutativity of two operators is considered in Sections 3.6 and
3.7. It is shown that two event operators commute (their order of
application is irrelevant) if and only if one operator is an identity operator
or the operators have equal limit points. A particular kind of systematic
sequence, one with repeated cycles of u events of one class and v events
of another, is analyzed in Section 3.8; the response probabilities are
computed for each trial from a formula similar to equation 3.5 shown
above.

When the event occurrences are uncertain, that is, when only probability
statements about the event on a given trial can be made, the analysis
becomes more complicated. We introduced three such cases which will
be used for describing various kinds of learning data: (1)
experimenter-controlled events, where the event probabilities have fixed values, $\pi_1$ and
$\pi_2$ (Sections 3.9, 3.10), (2) subject-controlled events, where the responses
are considered to be events and so the event probabilities change (Section
3.12), and (3) experimenter-subject-controlled events, where the events
correspond to response-outcome pairs (Section 3.13). For the case of
experimenter-controlled events, the following formulas are developed for
computing the mean response probabilities $\bar{p}_n$ on each trial:

$$ \bar{p}_n = \bar{\alpha}^n p + (1 - \bar{\alpha}^n)\, \bar{\lambda}, \tag{3.49} $$

where p is the initial response probability, where

$$ \bar{\alpha} = \pi_1 \alpha_1 + \pi_2 \alpha_2, \qquad \bar{a} = \pi_1 a_1 + \pi_2 a_2, \tag{3.48} $$

and

$$ \bar{\lambda} = \frac{\bar{a}}{1 - \bar{\alpha}}. \tag{3.50} $$

(The event probabilities are $\pi_1$ and $\pi_2$.) This analysis does not generalize
to the other two cases, which are treated in more detail in later chapters.

The case of subject-controlled events is related to the theory of Markov
chains as discussed in Section 3.12, but this discussion is preceded by an
elementary exposition of Markov processes in Section 3.11. It is pointed
out that when the $\alpha_i$ are zero, the case of subject-controlled events gives
a simple type of Markov chain, and that otherwise that case leads to a
Markov chain with an infinite number of states.


REFERENCES 


1. Graham, C., and Gagné, R. M. The acquisition, extinction, and spontaneous 
recovery of a conditioned operant response. J. exp. Psychol., 1940, 26, 251-280. 
2. Jenkins, W. O., McFann, H., and Clayton, F. L. A methodological study of
extinction following aperiodic and continuous reinforcement. J. compar. physiol.
Psychol., 1950, 43, 155-167.


3. Jenkins, W. O., and Stanley, J. C., Jr. Partial reinforcement: a review and critique. 
Psychol. Bull., 1950, 47, 193-234. 

4. Miller, G. A. Finite Markov processes in psychology. Psychometrika, 1952, 17, 
149-167. 

5. Miller, G. A., and Frick, F.C. Statistical behavioristics and sequences of responses. 
Psychol. Rev., 1949, 56, 311-324. 

6. Frick, F. C., and Miller, G. A. A statistical description of operant conditioning. 
Amer. J. Psychol., 1951, 64, 20-36. 

7. Feller, W. An introduction to probability theory and its applications. New York: 
Wiley, 1950, Chapter 15. 

8. Birkhoff, G., and MacLane, S. A survey of modern algebra. New York: 
Macmillan, 1941, pp. 11-13. 

9. Jordan, C. Calculus of finite differences. New York: Chelsea Publishing Co., 
1947, second edition, pp. 558-559. 

10. Estes, W. K. Toward a statistical theory of learning. Psychol. Rev., 1950, 57, 
94-107. 

11. Bush, R. R., and Mosteller, F. A mathematical model for simple learning. 
Psychol. Rev., 1951, 58, 313-323. 

12. Brunswik, E. Probability as a determiner of rat behavior. J. exp. Psychol., 1939, 
25, 175-197. 

13. Newman, E. B. The pattern of vowels and consonants in various languages. 
Amer. J. Psychol., 1951, 64, 369-379. 

14. Hake, H. W., and Hyman, R. Perception of the statistical structure of a random 
series of binary symbols. J. exp. Psychol., 1953, 45, 64-74. 


CHAPTER 4 
Distributions of Response Probabilities 


4.1 INTRODUCTION 


Suppose that a large group of organisms is run through an experiment 
for many trials. Furthermore, assume that different organisms make 
different responses on each trial and that different things happen to them. 
Under these circumstances the organisms will not all have the same 
probability of response 4; on a given trial. What percentage of organisms 
is more likely to make response A, than response A? Or what percentage 
of the organisms is at least 90 percent sure to make response 4,? What is 
the average probability of response 4i? Questions such as these are 
considered in this chapter. 

In the preceding chapter we described various kinds of sequences of 
events and computed the effects of the corresponding sequences of oper- 
ators. However, in Sections 3.9 through 3.13 we encountered problems 
in which the event occurrence at any position in the sequence was
uncertain. At most we knew the probability that the event was of a
particular kind such as $E_1$. As a result, for a large group of organisms a new
distribution of response probabilities is obtained after each trial. For
two response classes we had a set of possible response probabilities $p_{\nu n}$ for
trial $n$. The values of the index $\nu$ were just labels for the possible response
probabilities. Furthermore, we introduced a set of probabilities $P_{\nu n}$
which specified the likelihood that the various values of $p_{\nu n}$ would occur
on trial $n$. In other words, $P_{\nu n}$ is the probability of occurrence of the
response probability $p_{\nu n}$ on trial $n$. In order to avoid the awkward
phrase "probability of a probability" we call the $p_{\nu n}$ "p values" in this
chapter. Thus $P_{\nu n}$ is the probability of the p value $p_{\nu n}$. When we speak
of a distribution of p values, $p_{\nu n}$ is the variable of the distribution and $P_{\nu n}$
is the density function of $p_{\nu n}$. Frequently we call $P_{\nu n}$ a proportion (of
organisms) since it may help the reader to think of a very large population
of identical organisms undergoing the same learning process. Then $P_{\nu n}$
is the proportion of this large population that has the p value $p_{\nu n}$ on trial $n$.

We considered three main cases in Chapter 3: 


1. Experimenter-controlled events, 
2. Subject-controlled events, 
3. Experimenter-subject-controlled events. 


When the events are experimenter-controlled, they occur with fixed
probabilities $\pi_i$ as determined by the experimenter, whereas for
subject-controlled events, event $E_j$ necessarily occurs when alternative $A_j$ occurs.
For the more general problem of experimenter-subject-controlled events,
the conditional probabilities of various outcomes, given response $A_j$,
were fixed by the experimenter; but the probability of alternative $A_j$
being chosen by the subject was $p_j$ on trial $n$. Each alternative-outcome
pair was considered to be an event and, as usual, an operator was associated 
with an event. In this chapter we again consider these three cases. 

In Fig. 3.2 we wrote out the exact probability distribution (for trials 
0, 1, 2, and 3) that would be obtained by an infinite population of subjects 
with two experimenter-controlled events. On trial $n$ there are at most $2^n$
groups of subjects who have had different experiences since the experiment
began; therefore, writing out the complete details even for $n$ as small as
10 would usually involve 1024 groups. Since the explicit statement of all
these $2^n$ probabilities is most tedious for all but the smallest $n$, we would
like to have some general rules to calculate what portion of the subjects
will have values of p between $p$ and $p + \Delta p$ on trial $n$. That is to say,
we would like to know the probability distribution on successive trials. 
The best information, of course, would be an explicit formula for the 
distribution function. Failing this we could describe the distribution 
through its moments. 

In this chapter recurrence relations are derived for the raw moments
of the distribution function after n trials. The derivation is given first
for experimenter-controlled events, and the recurrence relations are
solved for the mean and variance of the distributions. Next, the
corresponding recurrence relations for subject-controlled events are developed
and extended to experimenter-subject-controlled events. Unfortunately
we are not able to solve for the moments explicitly in these latter two cases,
but the recurrence formulas are used in later chapters to discuss special
problems and to develop some useful approximations. 

This may be a good place to emphasize again the importance of studying 
the general properties of any model more deeply than specific applications 
may require. Only by understanding a model very generally is it possible 
to have a good "feel" for what it will do and how it will behave in new 
situations, and only in this way do we get a good notion about both its 
scope and its limitations. Therefore in this chapter and some others in
Part I the reader should not expect results that will be particularly useful
in analyzing a specific experiment. Rather the hope is that this material
will improve the reader's understanding just as it has the authors'.

Heretofore we have assumed that sequences of p values began at some
single point $p_0$. We have seen how distributions of p values are generated
by the operators even with a single initial probability. Some readers may
have felt that this restriction to a single value of $p_0$ is undesirable: different
organisms may come into an experiment with different initial probabilities,
that is, we might reasonably expect an initial distribution of p values.
Nevertheless, most of our analysis will be in terms of a single value of $p_0$.
The generalization to an arbitrary initial distribution is straightforward
but involves some mathematical complications we do not care to introduce.
Most of the results in this chapter, however, are valid for an arbitrary
distribution of initial p values.

In Section 4.7 we present some theorems about the asymptotic
distribution of p values when the events are experimenter controlled and
subject controlled. The proofs have been omitted. Finally, in Section
4.8 we discuss the lengths of runs of one response or another.


4.2 DEFINITION OF MOMENTS


The reader who is familiar with moments of distributions will find
nothing new in this section; we suggest that he skip to Section 4.3.

If we have a set of discrete probabilities $P_\nu > 0$, $\nu = 1, 2, \cdots$, finite or
infinite in number, each associated with a number $p_\nu$, then the mth raw
moment of the distribution of p's is defined to be

(4.1)    $V_m = \sum_\nu p_\nu^m P_\nu$,

where, of course,

(4.2)    $\sum_\nu P_\nu = 1$.

Thus the mth raw moment is the average or mean of the mth power of the
variable whose distribution is under study. For example, the first raw
moment, $V_1$, is the mean of the distribution. It is often convenient to
define moments about the mean instead of the $V$'s which are moments
about the origin. This can readily be done in terms of the raw moments.
The mth moment about the mean is defined as

(4.3)    $\mu_m = \sum_\nu (p_\nu - V_1)^m P_\nu$.


In words, the mth moment about the mean is the average of the mth
power of the deviation of the variable from its own mean. When this
form of the $\mu$'s is expanded we see that

         $\mu_0 = \sum_\nu P_\nu = 1$,

         $\mu_1 = \sum_\nu (p_\nu - V_1) P_\nu = \sum_\nu p_\nu P_\nu - \sum_\nu V_1 P_\nu = V_1 - V_1 = 0$,

(4.4)    $\mu_2 = \sum_\nu (p_\nu - V_1)^2 P_\nu = \sum_\nu (p_\nu^2 - 2V_1 p_\nu + V_1^2) P_\nu = V_2 - 2V_1^2 + V_1^2 = V_2 - V_1^2$.

In a similar way, the reader can verify that the third moment about the
mean is

(4.5)    $\mu_3 = V_3 - 3V_2V_1 + 2V_1^3$.
In order to obtain a general formula for $\mu_m$ in terms of the raw moments,
we can use the well-known binomial expansion of $(x - y)^m$:

(4.6)    $(x - y)^m = \sum_{u=0}^{m} (-1)^u \binom{m}{u} x^{m-u} y^u$,

where $\binom{m}{u}$ is the binomial coefficient equal to $m!/[(m - u)!\,u!]$. For
example $(x - y)^2 = x^2 - 2xy + y^2$ and $(x - y)^3 = x^3 - 3x^2y + 3xy^2 - y^3$.
Using this binomial expansion in equation 4.3, we have

         $\mu_m = \sum_\nu \sum_{u=0}^{m} (-1)^u \binom{m}{u} p_\nu^{m-u} V_1^u P_\nu$.

By interchanging the order of summation (this amounts to rearranging
terms) we have

(4.7)    $\mu_m = \sum_{u=0}^{m} (-1)^u \binom{m}{u} V_1^u \sum_\nu p_\nu^{m-u} P_\nu = \sum_{u=0}^{m} (-1)^u \binom{m}{u} V_1^u V_{m-u}$.

For m = 0, 1, 2, and 3, this formula gives results which agree with our 
earlier formulas. 
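
As a rough numerical check of equations 4.1 and 4.7, the following Python sketch computes raw moments directly and then builds the moments about the mean from them. The function names and the two-point distribution are illustrative assumptions, not part of the text's development.

```python
from math import comb

def raw_moment(p_values, proportions, m):
    """m-th raw moment V_m = sum of p_v**m * P_v (equation 4.1)."""
    return sum(p ** m * P for p, P in zip(p_values, proportions))

def central_moment(p_values, proportions, m):
    """m-th moment about the mean, built from raw moments (equation 4.7)."""
    v1 = raw_moment(p_values, proportions, 1)
    return sum(
        (-1) ** u * comb(m, u) * v1 ** u * raw_moment(p_values, proportions, m - u)
        for u in range(m + 1)
    )

# Hypothetical two-group distribution used only for illustration.
p_values = [0.2, 0.6]
proportions = [0.3, 0.7]   # must sum to one (equation 4.2)

mean = raw_moment(p_values, proportions, 1)
variance = central_moment(p_values, proportions, 2)
print(mean, variance)
```

For m = 0 and m = 1 the second function should return values very close to 1 and 0, in line with equations 4.4.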

Classically for any distribution the second moment about the mean is
called the variance of that distribution, and when it is used in statistical
problems the notation $\sigma^2$ usually replaces $\mu_2$. The square root of $\sigma^2$,
or $\sigma$, is called the standard deviation. It is a measure of the spread of a
distribution as opposed to $V_1$, the mean, which is a measure of location.
Changing only $V_1$ moves the whole distribution, but changing $\sigma$ spreads
the distribution out or pulls it together. We note that $\sigma^2$ is always positive
or zero because it is a sum of squares with positive weights. Because $p_\nu$
is between 0 and 1, $\sigma^2$ is always finite for our problems, as are all other
moments. This may be seen in equation 4.1; $p_\nu^m$ is at most unity for all
values of $p_\nu$ and $m$, and, since the $P_\nu$ sum to unity, we see that $0 \le V_m \le 1$
for all $m$.
It is not our intention to expand on this notion of moments in great
detail, because they are discussed in any elementary statistics book.
One important point that may not be mentioned in elementary textbooks 
is that knowledge of all the moments usually determines the cumulative 
distribution function, just as the distribution function completely deter- 
mines the moments. However it is usually not so easy to get the distri- 
bution from the moments as it is to get the moments from the distribution. 

If we order the groups according to their p values so that $p_{\nu+1} > p_\nu$ for
all $\nu$, the cumulative distribution is given by

(4.8)    $F(p) = \sum_{\nu=1}^{t} P_\nu$,    $t$ = largest $\nu$ such that $p_\nu \le p$.

This means that as $p$ increases, $F$ takes a jump each time $p$ comes to a $p_\nu$
with positive $P_\nu$, then stays the same until the next such jump.

EXAMPLE: $p_1 = 0.2$, $p_2 = 0.6$, $P_1 = 0.3$, $P_2 = 0.7$. Then

         $F(p) = 0$      for $-\infty < p < 0.2$,
         $F(p) = 0.3$    for $0.2 \le p < 0.6$,
         $F(p) = 1.0$    for $0.6 \le p \le \infty$.

In words, the cumulative distribution gives the probability that the variable
is less than or equal to a specified value.

Sometimes it is convenient to think of the p's as having a distribution
without specifying whether it is discrete, continuous, or both at once.
Then we write the cumulative as either

         $F(p)$    or    $\int_{-\infty}^{p} dF(p')$,

where the prime on $p$ is used to distinguish the variable of integration
from the upper limit of the integral. Although such a procedure can be
rigorously justified, it suffices for our purpose to remind the reader that
an integral is just the limit of a sum. Therefore if he prefers to think of
sums instead of integrals, not only should there be no difficulty, but he
will be essentially correct.

We proceed now to find moments of the distribution for the several
cases discussed in Sections 3.9 through 3.13.
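
The step-function behaviour of the cumulative distribution in equation 4.8 can be sketched in a few lines of Python. The function name `cumulative` is an assumption made for illustration; the numbers are those of the EXAMPLE in this section.

```python
def cumulative(p_values, proportions, p):
    """F(p): total proportion of groups whose p value is <= p (equation 4.8)."""
    return sum(P for p_v, P in zip(p_values, proportions) if p_v <= p)

# Values from the EXAMPLE: p_1 = 0.2, p_2 = 0.6, P_1 = 0.3, P_2 = 0.7.
p_values = [0.2, 0.6]
proportions = [0.3, 0.7]

for p in (0.1, 0.2, 0.5, 0.6, 0.9):
    print(p, cumulative(p_values, proportions, p))
```

Because the comparison is "less than or equal", the jump occurs exactly at each $p_\nu$, which is why the middle interval of the example is closed on the left.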


4.3 MOMENTS FOR TWO EXPERIMENTER-CONTROLLED
EVENTS


When the probabilities of applying $Q_1$ and $Q_2$ are $\pi_1$ and $\pi_2 = 1 - \pi_1$,
respectively, we have already seen (Section 3.9) that the probability of
applying any sequence of $Q$'s is exactly $\pi_1^u \pi_2^v$, where $u$ and $v$ are the
numbers of times $Q_1$ and $Q_2$ have been applied, respectively. If we regard
$p_{\nu n}$, $\nu = 1, 2, \cdots, 2^n$, as the p value for one of the $2^n$ groups of organisms
available after the nth random application of the operators to a single
initial probability $p_0$, and, if the proportion in the group is $P_{\nu n}$, on the
next trial this group splits into two parts as follows:

              New p Value                                  New Proportion
(4.9)    $Q_1 p_{\nu n} = a_1 + \alpha_1 p_{\nu n}$           $\pi_1 P_{\nu n}$
         $Q_2 p_{\nu n} = a_2 + \alpha_2 p_{\nu n}$           $\pi_2 P_{\nu n}$.

The mth raw moment on trial $n + 1$ will be called $V_{m,n+1}$. It can be
evaluated by summing the product of the mth power of the new p values
with the new proportions in the manner shown below:

(4.10)   $V_{m,n+1} = \sum_{\nu=1}^{2^n} (Q_1 p_{\nu n})^m \pi_1 P_{\nu n} + \sum_{\nu=1}^{2^n} (Q_2 p_{\nu n})^m \pi_2 P_{\nu n}$

         $= \pi_1 \sum_\nu (a_1 + \alpha_1 p_{\nu n})^m P_{\nu n} + \pi_2 \sum_\nu (a_2 + \alpha_2 p_{\nu n})^m P_{\nu n}$.

Using the binomial expansion of equation 4.6, we obtain

         $V_{m,n+1} = \pi_1 \sum_\nu \sum_{u=0}^{m} \binom{m}{u} a_1^{m-u} \alpha_1^u p_{\nu n}^u P_{\nu n} + \pi_2 \sum_\nu \sum_{u=0}^{m} \binom{m}{u} a_2^{m-u} \alpha_2^u p_{\nu n}^u P_{\nu n}$.

After interchanging the order of summation we can sum on $\nu$, remembering
the definition of moments given in equation 4.1. We get the


RECURRENCE FORMULA FOR THE MOMENTS:

(4.11)   $V_{m,n+1} = \sum_{u=0}^{m} \binom{m}{u} [\pi_1 a_1^{m-u} \alpha_1^u + \pi_2 a_2^{m-u} \alpha_2^u] V_{u,n}$.

Thus the mth moment of the distribution on trial $n + 1$ depends on all
the moments up to the mth on trial $n$.

We already know the mean of the distribution of p values on trial $n$
from equation 3.49; knowledge of the spread of the distribution, that is,
its variance, would be useful. Then we could know under what
circumstances the final distribution of p values is tightly clustered about its mean.
Writing out the formulas for the first two raw moments on trial $n + 1$
gives

         $V_{1,n+1} = \pi_1 a_1 + \pi_2 a_2 + (\pi_1 \alpha_1 + \pi_2 \alpha_2) V_{1,n}$,
(4.12)
         $V_{2,n+1} = \pi_1 a_1^2 + \pi_2 a_2^2 + 2(\pi_1 a_1 \alpha_1 + \pi_2 a_2 \alpha_2) V_{1,n} + (\pi_1 \alpha_1^2 + \pi_2 \alpha_2^2) V_{2,n}$.

The formula for the second moment can be written more conveniently as

(4.13)   $V_{2,n+1} = C_0 + C_1 V_{1,n} + C_2 V_{2,n}$,

where

         $C_0 = \pi_1 a_1^2 + \pi_2 a_2^2$,
(4.14)   $C_1 = 2(\pi_1 a_1 \alpha_1 + \pi_2 a_2 \alpha_2)$,
         $C_2 = \pi_1 \alpha_1^2 + \pi_2 \alpha_2^2$.

The solution of the first of equations 4.12 is given by equation 3.49
if we replace $p_n$ with $V_{1,n}$, $p_0$ with $V_{1,0}$, and $\lambda$ with $V_{1,\infty}$. We obtain then the


EXPLICIT FORMULA FOR THE MEAN:

(4.15)   $V_{1,n} = V_{1,\infty} - (V_{1,\infty} - V_{1,0}) \bar{\alpha}^n$.

From equations 3.48 and 3.50 we get the

ASYMPTOTIC FORMULA FOR THE MEAN:

(4.16)   $V_{1,\infty} = \bar{a} / (1 - \bar{\alpha})$,

where

(4.17)   $\bar{a} = \pi_1 a_1 + \pi_2 a_2$,
         $\bar{\alpha} = \pi_1 \alpha_1 + \pi_2 \alpha_2$.

With this evaluation of $V_{1,n}$ (equation 4.15), substituted into equation
4.13, we get

(4.18)   $V_{2,n+1} = (C_0 + C_1 V_{1,\infty}) - C_1 (V_{1,\infty} - V_{1,0}) \bar{\alpha}^n + C_2 V_{2,n}$.

Writing out equation 4.18 for a few values of $n$ quickly gives us the clue
to the solution of this difference equation. Let

(4.19)   $C_0' = C_0 + C_1 V_{1,\infty}$,
         $C_1' = -C_1 (V_{1,\infty} - V_{1,0})$,

so that

(4.20)   $V_{2,n+1} = C_0' + C_1' \bar{\alpha}^n + C_2 V_{2,n}$.


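We know by definition that, for a single initial value $p_0$, $V_{1,0} = p_0$ and
$V_{2,0} = p_0^2$; therefore

         $V_{2,1} = C_0' + C_1' + C_2 p_0^2$,

and

         $V_{2,2} = C_0' + C_1' \bar{\alpha} + C_2 (C_0' + C_1' + C_2 p_0^2)$
                 $= C_0' (1 + C_2) + C_1' (\bar{\alpha} + C_2) + C_2^2 p_0^2$.

Similarly, we can easily show that

         $V_{2,3} = C_0' (1 + C_2 + C_2^2) + C_1' (\bar{\alpha}^2 + \bar{\alpha} C_2 + C_2^2) + C_2^3 p_0^2$.

Continuing in this way, we can develop the general formula

(4.21)   $V_{2,n} = C_0' \sum_{u=0}^{n-1} C_2^u + C_1' \sum_{u=0}^{n-1} \bar{\alpha}^{n-1-u} C_2^u + C_2^n p_0^2$,

which can be proved correct by finishing the mathematical induction.
(See footnote in Section 3.3.) Provided that $C_2 \ne 1$ and $\bar{\alpha} \ne C_2$, we can
perform the summations and get the


EXPLICIT FORMULA FOR THE SECOND MOMENT:

(4.22)   $V_{2,n} = C_0' \frac{1 - C_2^n}{1 - C_2} + C_1' \frac{\bar{\alpha}^n - C_2^n}{\bar{\alpha} - C_2} + C_2^n p_0^2$

         $= (C_0 + C_1 V_{1,\infty}) \frac{1 - C_2^n}{1 - C_2} - C_1 (V_{1,\infty} - p_0) \frac{\bar{\alpha}^n - C_2^n}{\bar{\alpha} - C_2} + C_2^n p_0^2$.

From the definition of $C_2$ (equation 4.14) we see that $0 \le C_2 \le 1$ since
$\pi_1 + \pi_2 = 1$ and $0 \le \alpha_1^2 \le 1$ and $0 \le \alpha_2^2 \le 1$. Also, from definition
4.17 we see that $-1 \le \bar{\alpha} \le 1$. When $0 \le C_2 < 1$ and $-1 < \bar{\alpha} < 1$
much of this result tends to zero when $n \to \infty$ and we have the


ASYMPTOTIC FORMULA FOR THE SECOND MOMENT:

(4.23)   $V_{2,\infty} = \frac{C_0 + C_1 V_{1,\infty}}{1 - C_2}$.

This result can also be obtained more directly by setting $V_{2,n+1} = V_{2,n} = V_{2,\infty}$
and $V_{1,n} = V_{1,\infty}$ in the second of equations 4.12 and solving for $V_{2,\infty}$.
Asymptotically, the variance is

(4.24)   $\sigma_\infty^2 = V_{2,\infty} - V_{1,\infty}^2 = \frac{C_0 + C_1 V_{1,\infty}}{1 - C_2} - V_{1,\infty}^2$.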

Let us consider an example with parameter values $a_1 = 0.3$, $\alpha_1 = 0.6$,
$a_2 = 0.01$, $\alpha_2 = 0.9$, $\pi_1 = \pi_2 = 0.5$. Then

         $V_{1,\infty} = \frac{0.5(0.31)}{1 - 0.5(1.5)} = 0.62$,   $C_0 = 0.04505$,   $C_1 = 0.189$,   $C_2 = 0.585$.

For this example the asymptotic mean is 0.62, and the asymptotic standard
deviation is $\sigma_\infty = 0.08$.
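
The numbers in this example can be reproduced with a short Python sketch that evaluates the constants of equations 4.14 and 4.17, iterates the recurrences 4.12, and compares the result with the explicit formulas 4.15 and 4.22. The function names and the starting value $p_0 = 0.2$ used in the printout are assumptions chosen for illustration only.

```python
import math

# Parameter values from the example in the text.
a1, alpha1 = 0.3, 0.6
a2, alpha2 = 0.01, 0.9
pi1 = pi2 = 0.5

a_bar = pi1 * a1 + pi2 * a2                        # equation 4.17
alpha_bar = pi1 * alpha1 + pi2 * alpha2            # equation 4.17
c0 = pi1 * a1 ** 2 + pi2 * a2 ** 2                 # equation 4.14
c1 = 2 * (pi1 * a1 * alpha1 + pi2 * a2 * alpha2)   # equation 4.14
c2 = pi1 * alpha1 ** 2 + pi2 * alpha2 ** 2         # equation 4.14

v1_inf = a_bar / (1 - alpha_bar)                   # equation 4.16
v2_inf = (c0 + c1 * v1_inf) / (1 - c2)             # equation 4.23
sigma_inf = math.sqrt(v2_inf - v1_inf ** 2)        # square root of equation 4.24

def iterate_moments(p0, n_trials):
    """Apply recurrences 4.12 n_trials times, starting from a single p0."""
    v1, v2 = p0, p0 ** 2
    for _ in range(n_trials):
        # Tuple assignment: V_{2,n+1} must use V_{1,n}, not V_{1,n+1}.
        v1, v2 = a_bar + alpha_bar * v1, c0 + c1 * v1 + c2 * v2
    return v1, v2

def explicit_moments(p0, n):
    """Closed forms 4.15 and 4.22 for the first two raw moments on trial n."""
    v1 = v1_inf - (v1_inf - p0) * alpha_bar ** n
    v2 = ((c0 + c1 * v1_inf) * (1 - c2 ** n) / (1 - c2)
          - c1 * (v1_inf - p0) * (alpha_bar ** n - c2 ** n) / (alpha_bar - c2)
          + c2 ** n * p0 ** 2)
    return v1, v2

print(v1_inf, c0, c1, c2, round(sigma_inf, 2))
print(iterate_moments(0.2, 10), explicit_moments(0.2, 10))  # p0 = 0.2 is hypothetical
```

The two printed pairs on the last line should agree up to rounding error, which is a convenient check on the algebra leading to equation 4.22.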

The results of this section, together with a reasonable amount of 
computation, can tell us a good bit about the behavior of the probability 
distribution on successive trials. In addition to the moments, however, 
it might be valuable to have the percentage points of the cumulative 
distribution in the asymptotic case. For any finite number of applications 
of the operators the cumulative can be computed with the help of diagrams 
like Fig. 3.2. However, the labor shortly becomes prohibitive because
$2^n$ increases so rapidly. For the asymptotic distribution we can use the
theorem presented in Section 4.7.


*4.4 MOMENTS FOR t EXPERIMENTER-CONTROLLED
EVENTS

In this section we generalize the analysis of the preceding section to any
number $t$ of events. For two response classes the generalization is direct.
If the probability of application of $Q_i$ is $\pi_i$ where

(4.25)   $\sum_{i=1}^{t} \pi_i = 1$,

equation 4.10 is replaced by

(4.26)   $V_{m,n+1} = \sum_{i=1}^{t} \pi_i \sum_\nu (Q_i p_{\nu n})^m P_{\nu n} = \sum_{i=1}^{t} \pi_i \sum_\nu (a_i + \alpha_i p_{\nu n})^m P_{\nu n}$.

Using the binomial expansion, interchanging the order of summation of
$\nu$ and $u$, and using the definitions of the moments, we get

(4.27)   $V_{m,n+1} = \sum_{i=1}^{t} \pi_i \sum_{u=0}^{m} \binom{m}{u} a_i^{m-u} \alpha_i^u V_{u,n}$.

Computations similar to those in Section 4.3 give for the means

(4.28)   $V_{1,n} = V_{1,\infty} - (V_{1,\infty} - p_0) \bar{\alpha}^n$,


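and, for the second raw moments,

(4.29)   $V_{2,n} = (C_0 + C_1 V_{1,\infty}) \frac{1 - C_2^n}{1 - C_2} - C_1 (V_{1,\infty} - p_0) \frac{\bar{\alpha}^n - C_2^n}{\bar{\alpha} - C_2} + C_2^n p_0^2$,

where

         $V_{1,\infty} = \bar{a} / (1 - \bar{\alpha})$,   $\bar{a} = \sum_{i=1}^{t} \pi_i a_i$,   $\bar{\alpha} = \sum_{i=1}^{t} \pi_i \alpha_i$,
(4.30)
         $C_0 = \sum_{i=1}^{t} \pi_i a_i^2$,   $C_1 = 2 \sum_{i=1}^{t} \pi_i a_i \alpha_i$,   $C_2 = \sum_{i=1}^{t} \pi_i \alpha_i^2$.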

The preceding analysis can be extended to more than two response
classes by using the matrix operators of Section 1.8. If event $E_i$ occurs,
the operator,

(4.31)   $T_i = \alpha_i I + (1 - \alpha_i) \Lambda_i$,

is applied to the response probability vector. Event $E_i$ occurs with
probability $\pi_i$, and

(4.32)   $\sum_{i=1}^{t} \pi_i = 1$,

since we assume that precisely one event occurs on each trial. The
remaining problem is to specify something about the distributions of
probabilities which obtain.

We are necessarily involved in a multivariate distribution of probability
values on each trial. Such a distribution on any trial will have
moments and cross-product moments, but we shall restrict our attention
to the means of the marginal distributions. There will be $t^n$ possible
sequences leading up to trial $n$; the $\nu$th sequence has probability of
occurrence $P_{\nu n}$ and the probability vector generated by that sequence
is $\mathbf{p}_{\nu n}$. We then define a vector of marginal means

(4.33)   $\mathbf{V}_{1,n} = \sum_{\nu=1}^{t^n} P_{\nu n} \mathbf{p}_{\nu n}$.

This vector on the next trial is given by

(4.34)   $\mathbf{V}_{1,n+1} = \sum_{\nu=1}^{t^n} P_{\nu n} \sum_{i=1}^{t} \pi_i (T_i \mathbf{p}_{\nu n})$.

When we apply an operator $T_i$ to $\mathbf{p}_{\nu n}$ we get

(4.35)   $T_i \mathbf{p}_{\nu n} = \alpha_i \mathbf{p}_{\nu n} + (1 - \alpha_i) \boldsymbol{\lambda}_i$,

where $\boldsymbol{\lambda}_i$ is the limit vector of $T_i$. Using this in the expression for $\mathbf{V}_{1,n+1}$
we have

(4.36)   $\mathbf{V}_{1,n+1} = \sum_{\nu=1}^{t^n} P_{\nu n} \sum_{i=1}^{t} \pi_i \{\alpha_i \mathbf{p}_{\nu n} + (1 - \alpha_i) \boldsymbol{\lambda}_i\}$.

After interchanging the order of summation, using the above definition
for $\mathbf{V}_{1,n}$ and the fact that the $P_{\nu n}$ sum to unity, we have

(4.37)   $\mathbf{V}_{1,n+1} = \left(\sum_{i=1}^{t} \pi_i \alpha_i\right) \mathbf{V}_{1,n} + \sum_{i=1}^{t} \pi_i (1 - \alpha_i) \boldsymbol{\lambda}_i$.

As before we let

(4.38)   $\bar{\alpha} = \sum_{i=1}^{t} \pi_i \alpha_i$,

and also define an average limit vector by

(4.39)   $\bar{\boldsymbol{\lambda}} = \frac{\sum_{i=1}^{t} \pi_i (1 - \alpha_i) \boldsymbol{\lambda}_i}{\sum_{i=1}^{t} \pi_i (1 - \alpha_i)} = \frac{\sum_{i=1}^{t} \pi_i (1 - \alpha_i) \boldsymbol{\lambda}_i}{1 - \bar{\alpha}}$.

We can then write

(4.40)   $\mathbf{V}_{1,n+1} = \bar{\alpha} \mathbf{V}_{1,n} + (1 - \bar{\alpha}) \bar{\boldsymbol{\lambda}}$.

This vector difference equation may be solved at once by the method used
in Section 3.5 to yield

(4.41)   $\mathbf{V}_{1,n} = \bar{\alpha}^n \mathbf{V}_{1,0} + (1 - \bar{\alpha}^n) \bar{\boldsymbol{\lambda}}$.

Provided only that $|\bar{\alpha}| < 1$, the asymptotic vector of marginal means is

(4.42)   $\mathbf{V}_{1,\infty} = \bar{\boldsymbol{\lambda}}$.

These results are closely related to those obtained by Neimark [1]. The
main difference is that in the problems she treated, special restrictions on
the $\alpha_i$ and $\boldsymbol{\lambda}_i$ were appropriate. (See Chapter 13 for discussion of
the experiment.)
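
A small Python sketch may help make the vector recurrence 4.37 and its solution 4.41 concrete. Everything here is an illustrative assumption: the function names, the choice of three response classes and two events, and the parameter values are hypothetical and are not taken from Neimark's experiment.

```python
def average_limit(pis, alphas, limits):
    """Average limit vector of equation 4.39."""
    weights = [pi * (1 - alpha) for pi, alpha in zip(pis, alphas)]
    total = sum(weights)  # equals 1 - alpha_bar
    size = len(limits[0])
    return [sum(w * lam[j] for w, lam in zip(weights, limits)) / total
            for j in range(size)]

def step(v, pis, alphas, limits):
    """One application of recurrence 4.37 to the vector of marginal means."""
    return [sum(pi * (alpha * v[j] + (1 - alpha) * lam[j])
                for pi, alpha, lam in zip(pis, alphas, limits))
            for j in range(len(v))]

def mean_vector(n, v0, pis, alphas, limits):
    """Closed form 4.41 for the vector of marginal means on trial n."""
    alpha_bar = sum(pi * alpha for pi, alpha in zip(pis, alphas))  # equation 4.38
    lam_bar = average_limit(pis, alphas, limits)
    return [alpha_bar ** n * v + (1 - alpha_bar ** n) * lam
            for v, lam in zip(v0, lam_bar)]

# Hypothetical setup: two events, three response classes.
pis = [0.4, 0.6]
alphas = [0.8, 0.5]
limits = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
v0 = [0.2, 0.3, 0.5]

v = v0
for _ in range(6):
    v = step(v, pis, alphas, limits)
print(v, mean_vector(6, v0, pis, alphas, limits))
```

Iterating `step` and evaluating `mean_vector` should give the same vector, and for large n both approach the average limit vector of equation 4.42.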
4.5 MOMENTS FOR TWO SUBJECT-CONTROLLED
EVENTS

We consider next the special case of variable probability when there are
two possible responses and the probability of application of $Q_1$ is identical
with the p value at the time. We proceed in a manner quite the same as
that used for experimenter-controlled events.

If $p_{\nu n}$ is the p value for the $\nu$th group of organisms after $n$ applications
of the operators, if $P_{\nu n}$ is the size of the $\nu$th group expressed as a
proportion of the whole population, and if the operators are applied one
more time, we get the following table:

              New p Values                                 New Proportion
(4.43)   $Q_1 p_{\nu n} = a_1 + \alpha_1 p_{\nu n}$           $p_{\nu n} P_{\nu n}$
         $Q_2 p_{\nu n} = a_2 + \alpha_2 p_{\nu n}$           $(1 - p_{\nu n}) P_{\nu n}$.

Note that the only difference between this table and equation 4.9 is
the replacement of the constants $\pi_1$ and $\pi_2$ by the variables $p_{\nu n}$ and $1 - p_{\nu n}$
in the calculation of the new proportions. The mth moment on the
$(n + 1)$st trial is the sum of the mth powers of the p values weighted by
the group sizes over all the groups. Thus

(4.44)   $V_{m,n+1} = \sum_{\nu=1}^{2^n} (Q_1 p_{\nu n})^m p_{\nu n} P_{\nu n} + \sum_{\nu=1}^{2^n} (Q_2 p_{\nu n})^m (1 - p_{\nu n}) P_{\nu n}$

         $= \sum_\nu (a_1 + \alpha_1 p_{\nu n})^m p_{\nu n} P_{\nu n} + \sum_\nu (a_2 + \alpha_2 p_{\nu n})^m (1 - p_{\nu n}) P_{\nu n}$.

(In the previous section we factored out the $\pi$'s at this point, but the p's
cannot be factored out.) Using the binomial expansion of equation
4.6, we have

(4.45)   $V_{m,n+1} = \sum_\nu \sum_{u=0}^{m} \binom{m}{u} a_1^{m-u} \alpha_1^u p_{\nu n}^{u+1} P_{\nu n} + \sum_\nu \sum_{u=0}^{m} \binom{m}{u} a_2^{m-u} \alpha_2^u p_{\nu n}^{u} P_{\nu n}$

         $- \sum_\nu \sum_{u=0}^{m} \binom{m}{u} a_2^{m-u} \alpha_2^u p_{\nu n}^{u+1} P_{\nu n}$.

In view of the definition of moments, we can sum on $\nu$ in each of the
three sums on the right of the last equation to get the


RECURRENCE FORMULA FOR THE MOMENTS:

(4.46)   $V_{m,n+1} = \sum_{u=0}^{m} \binom{m}{u} a_1^{m-u} \alpha_1^u V_{u+1,n} - \sum_{u=0}^{m} \binom{m}{u} a_2^{m-u} \alpha_2^u V_{u+1,n} + \sum_{u=0}^{m} \binom{m}{u} a_2^{m-u} \alpha_2^u V_{u,n}$.


This equation can be written in a more convenient form if we collect
together the terms in $V_{1,n}$, $V_{2,n}$, etc., on the right side. We need to
determine a set of coefficients $C_{mu}$ which allows us to write

(4.47)   $V_{m,n+1} = \sum_{u=0}^{m+1} C_{mu} V_{u,n}$.

The problem, therefore, is to discover the correct form of the coefficients.
We first change the dummy index from $u$ to $v$ in the first two sums on
the right of equation 4.46 to get

(4.48)   $V_{m,n+1} = \sum_{v=0}^{m} \binom{m}{v} \{a_1^{m-v} \alpha_1^v - a_2^{m-v} \alpha_2^v\} V_{v+1,n} + \sum_{u=0}^{m} \binom{m}{u} a_2^{m-u} \alpha_2^u V_{u,n}$.

Then we let $v = u - 1$ and have

(4.49)   $V_{m,n+1} = \sum_{u=1}^{m+1} \binom{m}{u-1} \{a_1^{m-u+1} \alpha_1^{u-1} - a_2^{m-u+1} \alpha_2^{u-1}\} V_{u,n} + \sum_{u=0}^{m} \binom{m}{u} a_2^{m-u} \alpha_2^u V_{u,n}$.

We then separate out the terms for $u = 0$ in the last sum and for
$u = m + 1$ in the first sum and combine the other terms to get

(4.50)   $V_{m,n+1} = a_2^m V_{0,n} + \sum_{u=1}^{m} \left[\binom{m}{u-1} (a_1^{m-u+1} \alpha_1^{u-1} - a_2^{m-u+1} \alpha_2^{u-1}) + \binom{m}{u} a_2^{m-u} \alpha_2^u\right] V_{u,n} + (\alpha_1^m - \alpha_2^m) V_{m+1,n}$.

Therefore the coefficients $C_{mu}$ are

         $C_{mu} = a_2^m$                                                                  $(u = 0)$,
(4.51)   $C_{mu} = \binom{m}{u-1} (a_1^{m-u+1} \alpha_1^{u-1} - a_2^{m-u+1} \alpha_2^{u-1}) + \binom{m}{u} a_2^{m-u} \alpha_2^u$     $(u = 1, 2, \cdots, m)$,
         $C_{mu} = \alpha_1^m - \alpha_2^m$                                                $(u = m + 1)$.

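For the mean on trial $n + 1$ we have

(4.52)   $V_{1,n+1} = C_{10} + C_{11} V_{1,n} + C_{12} V_{2,n}$,

where

         $C_{10} = a_2$,
(4.53)   $C_{11} = a_1 - a_2 + \alpha_2$,
         $C_{12} = \alpha_1 - \alpha_2$.

For the second raw moment we have

(4.54)   $V_{2,n+1} = C_{20} + C_{21} V_{1,n} + C_{22} V_{2,n} + C_{23} V_{3,n}$,

where, from equation 4.51 with $m = 2$,

         $C_{20} = a_2^2$,
(4.55)   $C_{21} = a_1^2 - a_2^2 + 2 a_2 \alpha_2$,
         $C_{22} = 2(a_1 \alpha_1 - a_2 \alpha_2) + \alpha_2^2$,
         $C_{23} = \alpha_1^2 - \alpha_2^2$.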
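
The recurrences 4.52 and 4.54 can be checked against a brute-force enumeration of the $2^n$ groups described by table 4.43. The Python sketch below does this for a few trials. The helper names, the starting value $p_0 = 0.2$, and the reuse of the Section 4.3 parameter values are assumptions made purely for illustration.

```python
def next_groups(groups, a1, alpha1, a2, alpha2):
    """Split every (p value, proportion) group as in table 4.43."""
    out = []
    for p, prop in groups:
        out.append((a1 + alpha1 * p, p * prop))         # Q1 applied with probability p
        out.append((a2 + alpha2 * p, (1 - p) * prop))   # Q2 applied with probability 1 - p
    return out

def moment(groups, m):
    """m-th raw moment of the p-value distribution (equation 4.1)."""
    return sum(p ** m * prop for p, prop in groups)

# Hypothetical parameter values (borrowed from the Section 4.3 example).
a1, alpha1 = 0.3, 0.6
a2, alpha2 = 0.01, 0.9

# Coefficients from equations 4.53 and 4.55.
c10, c11, c12 = a2, a1 - a2 + alpha2, alpha1 - alpha2
c20 = a2 ** 2
c21 = a1 ** 2 - a2 ** 2 + 2 * a2 * alpha2
c22 = 2 * (a1 * alpha1 - a2 * alpha2) + alpha2 ** 2
c23 = alpha1 ** 2 - alpha2 ** 2

groups = [(0.2, 1.0)]   # single initial p value p0 = 0.2 (hypothetical)
max_error = 0.0
for _ in range(6):
    v1, v2, v3 = (moment(groups, m) for m in (1, 2, 3))
    groups = next_groups(groups, a1, alpha1, a2, alpha2)
    predicted_v1 = c10 + c11 * v1 + c12 * v2               # equation 4.52
    predicted_v2 = c20 + c21 * v1 + c22 * v2 + c23 * v3    # equation 4.54
    max_error = max(max_error,
                    abs(moment(groups, 1) - predicted_v1),
                    abs(moment(groups, 2) - predicted_v2))
print(len(groups), max_error)
```

The enumeration grows as $2^n$, so it is practical only for small n, but it shows directly why the mean on trial $n + 1$ needs the second moment on trial $n$.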


The result just obtained in equations 4.47 and 4.51 is a general recurrence
relation for all the raw moments. It has one unfortunate aspect. It
will be noted that the mth moment on the $(n + 1)$st trial depends on all
the moments through the $(m + 1)$st on the nth trial. If we start with a
particular value of $p_0$ at the zeroth trial and try to trace through, say,
the mean for the first six trials, then at the first trial we need $V_{1,0}$ and
$V_{2,0}$. At the second trial we need $V_{1,1}$ and $V_{2,1}$, which in turn implies
that we need $V_{3,0}$. At the third trial $V_{1,2}$ and $V_{2,2}$ generate $V_{1,1}$, $V_{2,1}$,
and $V_{3,1}$, which in turn require $V_{1,0}$, $V_{2,0}$, $V_{3,0}$, and $V_{4,0}$. So to get
$V_{1,6}$ we require $V_{7,0}$. This dependence on more and more early moments
is tedious, not because of the difficulty of computing $V_{m,0}$ (which is a
slight task) but because substitutions must be made into longer and
longer equations, and more and more equations. To continue the
example of $V_{1,6}$, we need a total of twenty-eight evaluations (if we regard
obtaining a quantity like $V_{m,n}$ as one evaluation). The quantities needed
are

         $V_{m,0}$    $m = 1, 2, \cdots, 7$
         $V_{m,1}$    $m = 1, 2, \cdots, 6$
         $V_{m,2}$    $m = 1, 2, \cdots, 5$
         $V_{m,3}$    $m = 1, 2, 3, 4$
         $V_{m,4}$    $m = 1, 2, 3$
         $V_{m,5}$    $m = 1, 2$
         $V_{m,6}$    $m = 1$

All these are quite apart from the values of the coefficients $C_{mu}$ that
must be computed. Except for small numbers of trials then, or for special
problems that are treated later, the moments are awkward to evaluate.
However, it may sometimes be valuable to evaluate them for small $n$ in
some specific cases to see how the curves for $V_{m,n}$ begin. It is probably
about the same amount of trouble to compute the moments from the
successive applications of the recurrence moment formulas as it is to
compute the p values and probabilities for the $2^n$ groups at the nth
application of the operators and then compute the moments from their
definition. With either procedure, rounding errors are cumulative and
must be watched carefully.


Even though we do not do much with the general recurrence formula
for the moments here, it is the starting point of our investigations of
special problems treated in later chapters.


4.6 MOMENTS FOR EXPERIMENTER-SUBJECT- 
CONTROLLED EVENTS 
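
We here consider the problem introduced in Section 3.13 and compute
the moments for the probability distribution. We restrict the discussion
to two alternatives, $A_1$ and $A_2$, with probabilities $p$ and $1 - p$, respectively,
but allow $s$ possible outcomes $O_k$. The conditional probability of outcome
$O_k$, given alternative $A_j$, is $\pi_{jk}$. The operators, $Q_{jk}$, are defined by

(4.56)   $Q_{jk} p = \alpha_{jk} p + (1 - \alpha_{jk}) \lambda_{jk} = a_{jk} + \alpha_{jk} p$.

The probability of application of $Q_{1k}$ is $\pi_{1k} p$ and the probability of
application of $Q_{2k}$ is $(1 - p) \pi_{2k}$. Furthermore,

(4.57)   $\sum_{k=1}^{s} \pi_{jk} = 1$,    $j = 1, 2$.

On trial $n$ there are $(2s)^n$ possible values of $p$. The values of $p$ are denoted
by $p_{\nu n}$, and the probability of the sequence up to $p_{\nu n}$ is denoted by $P_{\nu n}$.
Hence the mth raw moment on trial $n + 1$ is

(4.58)   $V_{m,n+1} = \sum_{\nu=1}^{(2s)^n} \left\{ p_{\nu n} P_{\nu n} \sum_{k=1}^{s} \pi_{1k} (Q_{1k} p_{\nu n})^m + (1 - p_{\nu n}) P_{\nu n} \sum_{k=1}^{s} \pi_{2k} (Q_{2k} p_{\nu n})^m \right\}$

         $= \sum_{\nu=1}^{(2s)^n} \left\{ p_{\nu n} P_{\nu n} \sum_{k=1}^{s} \pi_{1k} [a_{1k} + \alpha_{1k} p_{\nu n}]^m + (1 - p_{\nu n}) P_{\nu n} \sum_{k=1}^{s} \pi_{2k} [a_{2k} + \alpha_{2k} p_{\nu n}]^m \right\}$.

As before, we expand the expressions in square brackets by the binomial
expansion and get the general


RECURRENCE FORMULA FOR THE MOMENTS:

(4.59)   $V_{m,n+1} = \sum_{k=1}^{s} \sum_{u=0}^{m} \binom{m}{u} \left\{ \pi_{1k} a_{1k}^{m-u} \alpha_{1k}^u V_{u+1,n} + \pi_{2k} a_{2k}^{m-u} \alpha_{2k}^u (V_{u,n} - V_{u+1,n}) \right\}$.

In particular, for the means we have

(4.60)   $V_{1,n+1} = \sum_{k=1}^{s} \left\{ \pi_{1k} [a_{1k} V_{1,n} + \alpha_{1k} V_{2,n}] + \pi_{2k} [a_{2k} (1 - V_{1,n}) + \alpha_{2k} (V_{1,n} - V_{2,n})] \right\}$

         $= \sum_{k=1}^{s} \pi_{2k} a_{2k} + \sum_{k=1}^{s} [\pi_{1k} a_{1k} - \pi_{2k} a_{2k} + \pi_{2k} \alpha_{2k}] V_{1,n} + \sum_{k=1}^{s} (\pi_{1k} \alpha_{1k} - \pi_{2k} \alpha_{2k}) V_{2,n}$.

For compactness in writing we define

(4.61)   $\bar{a}_j = \sum_{k=1}^{s} \pi_{jk} a_{jk}$,    $\bar{\alpha}_j = \sum_{k=1}^{s} \pi_{jk} \alpha_{jk}$,    $j = 1, 2$,

and so we can then write

(4.62)   $V_{1,n+1} = \bar{a}_2 + (\bar{a}_1 - \bar{a}_2 + \bar{\alpha}_2) V_{1,n} + (\bar{\alpha}_1 - \bar{\alpha}_2) V_{2,n}$.

The similarity of this result to equation 4.52 may be noted.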


4.7 THEOREMS ABOUT THE p-VALUE DISTRIBUTIONS

In this section we state several theorems about the distributions of p
values. Rigorous proofs of these theorems are omitted.

TRAPPING THEOREM FOR r = 2. This theorem specifies an interval
within which p values are ultimately trapped. First, consider two event
operators having unique limit points and non-negative $\alpha$'s, that is,
operators for which $0 \le \alpha_i < 1$. The theorem asserts that if the p value
of any sequence is ever between the two limit points the sequence forever
remains there; this part of the theorem is intuitively obvious, for neither
operator can take a p value outside the interval. The theorem also
asserts that, when the sequence is outside the interval, it will later enter
that interval with probability 1, provided that the probability of
application of each operator is never zero. For more than two event operators
with unique limit points and non-negative $\alpha$'s, the same statements apply
to the interval between the largest and smallest limit points. Similar
theorems have been proved [2] without the restriction to non-negative
$\alpha$'s, but we are not concerned with negative $\alpha$'s in this book. When only
one operator has a unique limit point, that is, an $\alpha$ less than unity, all
sequences will approach that limit point with probability 1 (see Section
8.5).

TRAPPING THEOREM FOR r ≥ 3. For more than two response classes,
a similar trapping theorem applies to the convex union of the operator
limit points. (The convex union of a set of points is the smallest convex
set containing those points.) For example, when r = 3, the probability
vectors can be represented by points in a plane. If there are three operators
with distinct limit points, the convex union of those three points is the
triangle formed by connecting the points with straight lines, and the
theorem says that all sequences of probability vectors will be trapped in
this convex union. A similar theorem applies when negative $\alpha$'s are
allowed [2].

ASYMPTOTIC DISTRIBUTION THEOREM. For two response classes and
experimenter-controlled events, it has been shown that an asymptotic
distribution exists, that is, that the p-value distributions on trials $n$ and
$n + 1$ may be made as close to one another as desired by making $n$
sufficiently large. Furthermore, this distribution is independent of the
initial distribution of p values. For two subject-controlled events,
Harris has shown [3] that the same theorem is true provided that the $\alpha_i$
are non-negative and the absolute value of the difference between the
two limit points is less than unity. When the limits are zero and unity,
special cases arise that are discussed in Chapter 7.

ERGODIC THEOREM FOR SINGLE SEQUENCES. Whenever an asymptotic
distribution of p values exists, single sequences of p values possess an
important property. Such a single sequence traces out a population of
p values which form a distribution. In the limit as $n \to \infty$, this
distribution is almost certainly equivalent to the distribution from all possible
sequences. In Section 6.3, we develop a computation scheme, based
upon this theorem, for approximating the form of the asymptotic
distribution.



4.8 LENGTH OF RUNS 

In this section we are concerned with the expected or average length of 
a run of responses of one kind or another. Suppose we have a long 
sequence of $A_1$ and $A_2$ responses and we inquire about the average length
of a run of $A_1$ responses or $A_2$ responses or both. For example, if the
sequence is $A_1A_1A_2A_1A_1A_1A_2A_2A_2A_2A_2$, we observe two runs of $A_1$
responses, one of length two and one of length three so that the mean
length of $A_1$ runs is 2.5. Similarly we observe two runs of $A_2$ responses,
one of length one and one of length five, giving a mean length of $A_2$
runs of 3.0.

If the probability $p$ of an $A_1$ response is constant it is a simple matter
to compute the mean or expected length of a run. The solution to this
problem is well known [4], but we develop it here as an illustration. Let
there be an $A_1$ response on some trial $n$ preceded by an $A_2$ response on
trial $n - 1$; then an $A_1$ run begins on trial $n$. The probability that the
run is of length one is $(1 - p)$ since this is the probability of an $A_2$ response
on trial $n + 1$. The probability that the run is of length two is $p(1 - p)$,
that is, the probability that an $A_1$ occurs on trial $n + 1$ and an $A_2$
occurs on trial $n + 2$. More generally, the probability that the run is of
length $\nu$ is $p^{\nu-1}(1 - p)$. For convenience in computations we introduce
a random variable $R_1$ which represents the length of a run of $A_1$ responses.
It has possible values of $1, 2, \cdots, \nu, \cdots$, and the probability that $R_1$
has the value $\nu$ is

(4.63)   $\Pr\{R_1 = \nu\} = p^{\nu-1}(1 - p)$.

The mean length of an $A_1$ run is simply the expected value of $R_1$:

(4.64)   $E(R_1) = \sum_{\nu=1}^{\infty} \nu \Pr\{R_1 = \nu\} = \sum_{\nu=1}^{\infty} \nu p^{\nu-1}(1 - p) = \sum_{\nu=1}^{\infty} \nu p^{\nu-1} - \sum_{\nu=1}^{\infty} \nu p^{\nu}$.

If we write out a few terms in these sums we quickly see how to evaluate.
When $p < 1$ we may re-arrange terms as follows (the series is absolutely
convergent):

(4.65)   $E(R_1) = (1 + 2p + 3p^2 + \cdots) - (p + 2p^2 + 3p^3 + \cdots) = 1 + p + p^2 + p^3 + \cdots$.

For $p < 1$ this last series is simply the expansion of $(1 - p)^{-1}$; the reader
may verify this by long division. Hence

(4.66)   $E(R_1) = \frac{1}{1 - p}$.

If $p = 1/2$ for example, as in a coin-flipping experiment, $E(R_1) = 2$, that
is, the average length of a run of heads is 2. Or, if $p = 1/6$ is the probability
of a four in a roll of a die, the expected length of a run of fours is
$E(R_1) = (1 - 1/6)^{-1} = 1.2$.
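
Both the observed mean run lengths of the earlier example and the expected value of equation 4.66 are easy to check numerically. The following Python sketch is one way to do so; the function names, the integer labels 1 and 2 for the two responses, and the truncation at 500 terms are assumptions made for illustration.

```python
from itertools import groupby

def mean_run_lengths(sequence):
    """Mean observed run length for each response label in a sequence."""
    runs = {}
    for label, group in groupby(sequence):
        runs.setdefault(label, []).append(len(list(group)))
    return {label: sum(lengths) / len(lengths) for label, lengths in runs.items()}

def expected_run_length(p, terms=500):
    """Partial sum of the series in equation 4.64 for constant p < 1."""
    return sum(v * p ** (v - 1) * (1 - p) for v in range(1, terms + 1))

# The example sequence, with 1 standing for A_1 and 2 for A_2.
sequence = [1, 1, 2, 1, 1, 1, 2, 2, 2, 2, 2]
print(mean_run_lengths(sequence))

# Compare the truncated series with the closed form 1 / (1 - p) of equation 4.66.
for p in (1 / 2, 1 / 6):
    print(p, expected_run_length(p), 1 / (1 - p))
```

The truncated series agrees with the closed form to within rounding error for these values of p, because the neglected terms are vanishingly small.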

When the probability $p_n$ of an $A_1$ response on trial $n$ changes from trial
to trial, we can compute the probability of runs of various lengths in
certain special cases. Suppose again that an $A_2$ response occurs on trial
$n - 1$ and an $A_1$ on trial $n$. The probability that the run is of length one
is $1 - p_{n+1}$; the probability of a run of length two is $p_{n+1}(1 - p_{n+2})$,
etc. We introduce a random variable $R_{1,n}$, denoting the length of a run
of $A_1$ responses beginning on trial $n$, and write

(4.67)    $\Pr\{R_{1,n} = \nu\} = p_{n+1} p_{n+2} \cdots p_{n+\nu-1}(1 - p_{n+\nu})$
          $= \prod_{\kappa=1}^{\nu-1} p_{n+\kappa} \, (1 - p_{n+\nu})$
          $= \prod_{\kappa=1}^{\nu-1} p_{n+\kappa} - \prod_{\kappa=1}^{\nu} p_{n+\kappa} \qquad (\nu = 2, 3, \cdots).$

We find it convenient to define a quantity $\tau_{n,\nu}$ by

(4.68)    $\tau_{n,\nu} = \prod_{\kappa=1}^{\nu} p_{n+\kappa} \qquad (\nu = 1, 2, \cdots),$
          $\tau_{n,0} = 1.$


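With this definition we can write equation 4.67 in the form

(4.69)    $\Pr\{R_{1,n} = \nu\} = \tau_{n,\nu-1} - \tau_{n,\nu} \qquad (\nu = 1, 2, \cdots).$

The expected value of $R_{1,n}$ is the mean length of run beginning on trial
$n$. It is

(4.70)    $E(R_{1,n}) = \sum_{\nu=1}^{\infty} \nu \Pr\{R_{1,n} = \nu\}$
          $= \sum_{\nu=1}^{\infty} \nu(\tau_{n,\nu-1} - \tau_{n,\nu})$
          $= \sum_{\nu=1}^{\infty} \nu\tau_{n,\nu-1} - \sum_{\nu=1}^{\infty} \nu\tau_{n,\nu}.$

By writing out a few terms we see that

(4.71)    $E(R_{1,n}) = (\tau_{n,0} + 2\tau_{n,1} + 3\tau_{n,2} + \cdots) - (\tau_{n,1} + 2\tau_{n,2} + 3\tau_{n,3} + \cdots)$
          $= \sum_{\nu=0}^{\infty} \tau_{n,\nu}.$

We have assumed that the series is absolutely convergent.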




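For any particular sequence of response probabilities, $p_n$, if we can
compute the products $\tau_{n,\nu}$ of equation 4.68, we can obtain the distribution
function and the expected run length. For the case considered previously,
when $p_n$ is constant, $\tau_{n,\nu}$ reduces to $p^{\nu}$ and $E(R_{1,n}) = 1/(1 - p)$ as obtained
before. In addition to the mean run length we may also wish to obtain
its variance. Thus we first compute the expected value of $R_{1,n}^2$ and then
use the formula

(4.72)    $\sigma^2(R_{1,n}) = E(R_{1,n}^2) - [E(R_{1,n})]^2.$

We have then

(4.73)    $E(R_{1,n}^2) = \sum_{\nu=1}^{\infty} \nu^2 \Pr\{R_{1,n} = \nu\} = \sum_{\nu=1}^{\infty} \nu^2(\tau_{n,\nu-1} - \tau_{n,\nu}).$

Again, we write out a few terms:

(4.74)    $E(R_{1,n}^2) = (\tau_{n,0} + 4\tau_{n,1} + 9\tau_{n,2} + 16\tau_{n,3} + \cdots)$
          $- (\tau_{n,1} + 4\tau_{n,2} + 9\tau_{n,3} + \cdots)$
          $= \tau_{n,0} + 3\tau_{n,1} + 5\tau_{n,2} + 7\tau_{n,3} + \cdots.$

We may write this as a sum of two series:

(4.75)    $E(R_{1,n}^2) = (\tau_{n,0} + \tau_{n,1} + \tau_{n,2} + \tau_{n,3} + \cdots) + (2\tau_{n,1} + 4\tau_{n,2} + 6\tau_{n,3} + \cdots)$
          $= \sum_{\nu=0}^{\infty} \tau_{n,\nu} + 2\sum_{\nu=0}^{\infty} \nu\tau_{n,\nu}$
          $= 2\sum_{\nu=0}^{\infty} \nu\tau_{n,\nu} + E(R_{1,n}).$

Therefore, the variance is

(4.76)    $\sigma^2(R_{1,n}) = 2\sum_{\nu=0}^{\infty} \nu\tau_{n,\nu} + E(R_{1,n}) - [E(R_{1,n})]^2.$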


Thus we see again that, if we can compute the products $\tau_{n,\nu}$, we can use
these in obtaining further information about length of runs. For the
previously discussed case of fixed probabilities, for which $\tau_{n,\nu} = p^{\nu}$,
we have

(4.77)    $\sum_{\nu=0}^{\infty} \nu\tau_{n,\nu} = \sum_{\nu=0}^{\infty} \nu p^{\nu} = p + 2p^2 + 3p^3 + \cdots.$

This series is well known, and the reader may easily verify by the binomial
expansion of $(1 - p)^{-2}$ that

(4.78)    $\sum_{\nu=0}^{\infty} \nu p^{\nu} = \frac{p}{(1 - p)^2}.$

The variance is then

(4.79)    $\sigma^2(R_{1,n}) = \frac{2p}{(1 - p)^2} + \frac{1}{1 - p} - \frac{1}{(1 - p)^2} = \frac{p}{(1 - p)^2}.$
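The general formulas 4.71 and 4.76 lend themselves to direct numerical evaluation. The following Python sketch accumulates the products $\tau_{n,\nu}$ term by term for any sequence of response probabilities and returns the mean and variance of the run length. The function names, the truncation rule, and the tolerance are our own illustrative assumptions; the truncation is only reasonable when the products shrink quickly, as they do when the probabilities stay bounded away from unity.

```python
from itertools import islice


def run_length_moments(probs, max_terms=10_000, tol=1e-15):
    """Mean and variance of the run length via equations 4.71 and 4.76.

    `probs` yields p_{n+1}, p_{n+2}, ..., the A_1 probabilities on the
    trials that follow the first response of the run.
    """
    tau = 1.0           # tau_{n,0} = 1
    sum_tau = 1.0       # running sum of tau_{n,nu}
    sum_nu_tau = 0.0    # running sum of nu * tau_{n,nu}
    for nu, p in enumerate(islice(probs, max_terms), start=1):
        tau *= p                    # tau_{n,nu} = tau_{n,nu-1} * p_{n+nu}
        sum_tau += tau
        sum_nu_tau += nu * tau
        if tau < tol:               # remaining terms are negligible
            break
    mean = sum_tau
    variance = 2.0 * sum_nu_tau + mean - mean ** 2
    return mean, variance


def constant(p):
    """Endless sequence of a fixed response probability p."""
    while True:
        yield p


if __name__ == "__main__":
    print(run_length_moments(constant(0.5)))    # compare with 4.66 and 4.79
```

For a constant probability the output can be compared with $1/(1 - p)$ and $p/(1 - p)^2$ from equations 4.66 and 4.79.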


We now apply the preceding analysis to a more complicated case, that
discussed in Section 3.3, for which a single operator $Q_1$ is applied
repeatedly. We then obtain $p_{n+\kappa} = Q_1^{n+\kappa} p_0$ for the probability of an
$A_1$ response on trial $n + \kappa$. We further assume that $\lambda_1$, the limit point
of $Q_1$, is zero so that

(4.80)    $p_{n+\kappa} = \alpha_1^{n+\kappa} p_0.$


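Equation 4.68 then gives for the product $\tau_{n,\nu}$

(4.81)    $\tau_{n,\nu} = \prod_{\kappa=1}^{\nu} \alpha_1^{n+\kappa} p_0$
          $= p_0^{\nu} \alpha_1^{n\nu} [(\alpha_1)(\alpha_1^2)(\alpha_1^3) \cdots (\alpha_1^{\nu})].$

The product in the brackets is $\alpha_1$ raised to a power which is the sum of
the first $\nu$ integers. It is well known (and easily verified) that

(4.82)    $1 + 2 + 3 + \cdots + \nu = \frac{\nu(\nu + 1)}{2}.$

Thus

(4.83)    $\tau_{n,\nu} = p_0^{\nu} \alpha_1^{n\nu + \nu(\nu+1)/2}, \qquad \nu \neq 0.$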


The expected run length of equation 4.71 is then 


(4.84) E(R,,) = %" Yap + 1. 


This sum cannot be evaluated except by numerical means, but in Table 


A at the end of the book we give values of the function 
(4.85) Qs, p) = X e * p. 
v=0 


Thus, for our present problem, we have 


(4.86) E(R;,n) aa og" Das, Po) +(1— a"). 




For known values of $\alpha_1$ and $p_0$, we may readily compute $E(R_{1,n})$ by using
Table A.
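When Table A is not at hand, the mean and variance for this case can also be obtained by summing the series of equations 4.71 and 4.76 directly. The Python sketch below does this for response probabilities of the form given in equation 4.80; it builds each product $\tau_{n,\nu}$ from the individual probabilities rather than from any tabled function. The function name, the truncation rule, and the parameter values are illustrative assumptions, not part of the original text.

```python
def run_moments_decaying(alpha, p0, n, max_terms=10_000, tol=1e-15):
    """Mean and variance of an A_1 run beginning on trial n when
    p_{n+k} = alpha**(n+k) * p0 (equation 4.80), using 4.71 and 4.76."""
    tau, sum_tau, sum_nu_tau = 1.0, 1.0, 0.0
    for nu in range(1, max_terms + 1):
        tau *= alpha ** (n + nu) * p0   # multiply in p_{n+nu}
        sum_tau += tau
        sum_nu_tau += nu * tau
        if tau < tol:                   # remaining terms are negligible
            break
    mean = sum_tau
    variance = 2.0 * sum_nu_tau + mean - mean ** 2
    return mean, variance


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    for n in (0, 5, 10):
        print(n, run_moments_decaying(alpha=0.9, p0=0.8, n=n))
```

Because the response probability decreases from trial to trial, runs that begin on later trials should come out shorter on the average.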


The variance of the lengths of $A_1$ runs when equation 4.80 gives the
probabilities is


a. 


(487) XR) —2X "Po" 7? + ECR, a) — [E(R, f. 
0 


= 
Also in Table A we give the function 
x 


(4.88) F(x, f) = 5 rett, 


r=() 
and so we may write 


(4.89) o?(R;,n) = 22, F (a4, pg) a E(R,,) : [ECR, JP. 


For known values of $\alpha_1$ and $p_0$ we can compute this variance by using
Table A.

The mean and variance of the distribution of run lengths can be easily
computed only in certain special cases. We have illustrated the procedure
when $p_n$ is constant and when $p_n = \alpha_1^n p_0$. Unfortunately, most other
cases are rather complicated and require tedious numerical computations.
We shall have occasion in Chapters 8 and 11 to use some of the results of
this section. We summarize these results by the following formulas:


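          $E(R_{1,n}) = \sum_{\nu=0}^{\infty} \tau_{n,\nu},$

          $\sigma^2(R_{1,n}) = 2\sum_{\nu=0}^{\infty} \nu\tau_{n,\nu} + E(R_{1,n}) - [E(R_{1,n})]^2,$

where

          $\tau_{n,\nu} = \prod_{\kappa=1}^{\nu} p_{n+\kappa}, \qquad \tau_{n,0} = 1.$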


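For runs of $A_2$ instead of $A_1$ responses, we need a random variable
$R_{2,n}$ representing the length of an $A_2$ run beginning on trial $n$. Its
expected value and variance are

(4.90)    $E(R_{2,n}) = \sum_{\nu=0}^{\infty} \tilde{\tau}_{n,\nu},$

(4.91)    $\sigma^2(R_{2,n}) = 2\sum_{\nu=0}^{\infty} \nu\tilde{\tau}_{n,\nu} + E(R_{2,n}) - [E(R_{2,n})]^2,$

where

(4.92)    $\tilde{\tau}_{n,\nu} = \prod_{\kappa=1}^{\nu} q_{n+\kappa}, \qquad \tilde{\tau}_{n,0} = 1.$

The $q$'s are the $A_2$ response probabilities.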




4.9 SUMMARY 


Distributions of response probabilities produced by various types of 
event sequences are analyzed in this chapter; recurrence formulas for 
the moments are developed for all cases discussed. For experimenter- 
controlled events, the recurrence formulas for the means and second raw 
moments are solved to yield explicit formulas in Sections 4.3 and 4.4. 
These derivations are restricted to two response classes, but in Section 
4.4 the analysis of the means is extended to any number r of responses. 
For subject-controlled and experimenter-subject-controlled events, dis- 
cussed in Sections 4.5 and 4.6, respectively, recurrence formulas for the 
moments are derived but not solved to give explicit formulas. 

Several theorems about the distributions of p values are stated without 
proof in Section 4.7. These theorems are important in studying the 
properties of the mathematical system but have little direct relevance to 
applications of the system. Finally, in Section 4.8, we discuss the
distributions of run lengths and derive expressions to be used in Chapters
8 and 11.
REFERENCES 


1. Neimark, E. D. Effects of type of non-reinforcement and number of alternative
   responses in two verbal conditioning situations. Ph.D. thesis, Indiana University,
   1953.
2. Bush, R. R., Mosteller, F., and Thompson, G. L. A formal structure for multiple
   choice situations, Decision processes (edited by R. M. Thrall, C. H. Coombs, and
   R. L. Davis), New York: Wiley, 1954, 99-126.
3. Harris, T. E. Personal communication and abstract: Annals of math. Stat., 1952,
   23, 141.
4. Feller, W. An introduction to probability theory and its applications. New York:
   Wiley, 1950, pp. 56-59.
5. Karlin, S. Some random walks arising in learning models I. Pacific J. of Math.,
   1953, 3, 725-756.


CHAPTER 5 


The Equal Alpha Condition 


5.1 INTRODUCTION


In a number of experimental designs, the psychologist introduces a 
certain kind of complementarity between two events. Roughly speaking, 
event $E_1$ has the same effect on response $A_1$ as event $E_2$ has on response $A_2$.
For example, a subject may be asked to predict which of two light bulbs 
will be illuminated on each trial. The responses are the two possible 
predictions, and the events are the turning on of the light bulbs. We 
would expect that the illumination of the left bulb would influence the 
probability of predicting "left" in the same way as the illumination of the
right bulb would alter the probability of predicting "right." In other 
words, if the roles of the two events were interchanged and the response 
labels were reversed, the basic design of the experiment would not be 
altered. 

Another example of this kind of symmetry is the Brunswik T-maze 
experiment first described in the Introduction. We would expect that a
rewarded right turn would have the same effect on p, the probability of 
turning right, as a rewarded left turn would have on q, the probability of 
turning left. Again, "right" and "left" could be interchanged without 
changing the design. Indeed, in many such experiments, half of the 
subjects are trained with one choice “favorable” and the other half with 
the opposite choice "favorable." 

The symmetry or complementarity which exists in the design of experi- 
ments such as those just mentioned should not be confused with position 
preferences or initial tendencies to make one response more than the other. 
These preferences may be described by taking the initial response prob- 
abilities different from 0.5, whereas the symmetry has to do with the event 
operators. 

Within the framework of our general model, we can make the symmetry
notion described above precise. Consider two responses $A_1$ and $A_2$,
with probabilities $p$ and $q = 1 - p$, and two events $E_1$ and $E_2$ with
operators $Q_1$ and $Q_2$ defined by

(5.1)    $Q_1 p = \alpha_1 p + (1 - \alpha_1)\lambda_1,$
         $Q_2 p = \alpha_2 p + (1 - \alpha_2)\lambda_2.$

It is convenient to replace the second equation by one involving the
complementary operator $\tilde{Q}_2$ introduced in Section 1.6. We have

(5.2)    $\tilde{Q}_2 q = 1 - Q_2 p$
         $= 1 - \alpha_2 p - (1 - \alpha_2)\lambda_2.$

Replacing $p$ with $1 - q$ and re-arranging, we get

(5.3)    $\tilde{Q}_2 q = \alpha_2 q + (1 - \alpha_2)(1 - \lambda_2).$

The symmetry requirement is as follows: We want $Q_1 p$ to be the same
function of $p$ as $\tilde{Q}_2 q$ is of $q$. This is accomplished by letting

(5.4)    $\alpha_1 = \alpha_2,$
(5.5)    $\lambda_1 = 1 - \lambda_2.$

The first equation says that the slope parameters are equal (this is called
the equal alpha condition), and the second equation says that the two
operators $Q_1$ and $Q_2$ have complementary limit points.
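The symmetry requirement can be checked numerically. The following Python sketch, which is not part of the original text, codes the two event operators and the complementary operator and confirms that, under conditions 5.4 and 5.5, the operator for the first event acts on $p$ exactly as the complementary operator for the second event acts on $q$. The function names and the parameter values are illustrative assumptions.

```python
def q1(p, alpha, lam1):
    """Operator for event E_1 in the fixed point form of equation 5.1."""
    return alpha * p + (1 - alpha) * lam1


def q2(p, alpha, lam2):
    """Operator for event E_2 in the fixed point form of equation 5.1."""
    return alpha * p + (1 - alpha) * lam2


def q2_complement(q, alpha, lam2):
    """Complementary operator: acts on q = 1 - p and returns 1 - Q_2 p."""
    return 1 - q2(1 - q, alpha, lam2)


def is_symmetric(alpha, lam1, lam2, points=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """True when Q_1 is the same function of p as the complement of Q_2 is of q."""
    return all(abs(q1(x, alpha, lam1) - q2_complement(x, alpha, lam2)) < 1e-12
               for x in points)


if __name__ == "__main__":
    print(is_symmetric(0.7, 0.9, 0.1))   # complementary limit points
    print(is_symmetric(0.7, 0.9, 0.3))   # limit points not complementary
```

With equal slopes, the check succeeds only when the two limit points sum to unity.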

The symmetric effect of two events in many experiments strongly 
suggests the equal alpha condition as described above, but there is another 
motivation for investigating consequences of this condition. In Sections 
4.5 and 4.6 we encountered some serious mathematical problems which 
become greatly simplified when the equal alpha condition is imposed. 
The recurrence formulas for the moments for subject-controlled and 
experimenter-subject-controlled events could not be solved for general 
explicit formulas; the one exception to this arises when the $\alpha_i$ are equal.
We consider this major mathematical simplification reason enough for 
studying the model with the equal alpha condition imposed. 

With experimenter-controlled events, the equal alpha condition leads 
to only minor algebraic simplifications; in Sections 4.3 and 4.4 we obtained 
explicit formulas for the means and variances without using the equal 
alpha restriction. Nevertheless, we examine the consequences of the 
equal alpha condition for experimenter-controlled events, mainly because 
the results are used in Chapter 13 to analyze some data. 

As mentioned above, the main mathematical simplifications result from 
imposing the equal alpha condition when the events are subject controlled 
or experimenter-subject controlled. However, in the analysis of experi- 
ments for which those cases seem most appropriate, the equal alpha 




condition seems least appropriate, as we shall see in Chapter 13. The 
equal alpha condition implies that reward and non-reward have "equal 
but opposite" effects on behavior; such an assumption is seldom war- 
ranted, either by psychological theory or by data. In spite of such 
objections, the equal alpha assumption leads to a useful “base-line” 
model for many experiments, that is, the data can be compared with this 
base-line model in much the same way that we compare results with a 
null hypothesis in routine statistical analyses. 


5.2 IMPLICATIONS IN THE SET-THEORETIC MODEL 


We now examine what the equal alpha condition means in terms of the
theory of stimulus conditioning presented in Chapter 2. In Section 2.4
we showed that the parameters $a$ and $b$ in the operators for two response
classes could be considered as the relative measures of two subsamples,
$A$ and $B$, of stimuli from the sample $X$. After a response occurred, the
subsample $A$ became conditioned to response $A_1$, and the subsample $B$
became conditioned to response $A_2$. By definition, we have

(5.6)    $\alpha_i = 1 - a_i - b_i \qquad (i = 1, 2),$

and so we see that the condition $\alpha_1 = \alpha_2$ is equivalent to

(5.7)    $a_1 + b_1 = a_2 + b_2.$

For two events, $E_1$ and $E_2$, we must distinguish between a sample $X_1$ and
its subsamples $A_1$ and $B_1$ which correspond to event $E_1$, and a sample $X_2$
with its subsamples $A_2$ and $B_2$ which correspond to event $E_2$. The equal
alpha restriction is then equivalent to

(5.8)    $m(A_1) + m(B_1) = m(A_2) + m(B_2),$

or, since $A_i$ and $B_i$ are disjunct,

(5.9)    $m(A_1 \cup B_1) = m(A_2 \cup B_2),$

where the symbol $\cup$ indicates the set sum or union. Now the stimulus
set $A_1 \cup B_1$ is the total set of stimuli available for reconditioning when
event $E_1$ occurs. Similarly, $A_2 \cup B_2$ is the set available for reconditioning
when event $E_2$ occurs. Equation 5.9 shows that the measure of the
elements available for reconditioning is the same for both events.


5.3 EXPERIMENTER-CONTROLLED EVENTS 
For several experiments discussed in Chapter 13, we would be willing
to identify the events in the model with certain stimulus changes which
are scheduled in advance by the experimenter. For example, a pre-arranged
sequence of appearances and non-appearances of a light is used
in some such experiments. As a result, the case of experimenter-controlled
events is applicable. When we further assume that $\alpha_1 = \alpha_2 = \alpha$ for
problems involving two events, the computation of the means and
variances of the response probability distributions is quite simple. In
this section we develop the necessary formulas for such computations.

In Section 4.3 we derive explicit formulas for the mean and variance of
the $p$-value distributions for two experimenter-controlled events. When
$\alpha_1 = \alpha_2 = \alpha$, equations 4.15, 4.16, and 4.17 give the

EXPLICIT FORMULA FOR THE MEANS:

(5.10)    $V_{1,n} = V_{1,\infty} - (V_{1,\infty} - V_{1,0})\alpha^n,$

and the

ASYMPTOTIC FORMULA FOR THE MEANS:

(5.11)    $V_{1,\infty} = \pi_1\lambda_1 + \pi_2\lambda_2.$


The formulas for the second raw moments, equations 4.22 and 4.23, are
also simplified when $\alpha_1 = \alpha_2 = \alpha$. Furthermore, the variances, defined by

(5.12)    $\sigma_n^2 = V_{2,n} - V_{1,n}^2,$

can be computed from the following simple formulas which result when
$\alpha_1 = \alpha_2 = \alpha$:

EXPLICIT FORMULA FOR THE VARIANCES:

(5.13)    $\sigma_n^2 = \sigma_\infty^2 - (\sigma_\infty^2 - \sigma_0^2)\alpha^{2n},$

ASYMPTOTIC FORMULA FOR THE VARIANCE:

(5.14)    $\sigma_\infty^2 = \frac{1 - \alpha}{1 + \alpha}[(\pi_1\lambda_1^2 + \pi_2\lambda_2^2) - V_{1,\infty}^2].$


For many applications, as we see in Chapters 7 and 13, we are also
willing to assume that $\lambda_1 = 1$ and $\lambda_2 = 0$. Then we have

(5.15)    $V_{1,\infty} = \pi_1,$

(5.16)    $\sigma_\infty^2 = \frac{1 - \alpha}{1 + \alpha}\pi_1(1 - \pi_1).$

The first of this pair of equations shows that the asymptotic mean probability
of response $A_1$ is equal to the probability of occurrence of event $E_1$.
This conclusion is consistent with a rather large class of experimental data
described in Chapter 13.

When there are more than two events, similar formulas are obtained
as indicated by the analysis in Section 4.4. For two response classes and
$t$ experimenter-controlled events with $\alpha_i = \alpha$ $(i = 1, 2, \cdots, t)$, the
foregoing explicit formulas for the means and variances are still correct, but
the asymptotic formulas become

(5.17)    $V_{1,\infty} = \sum_{i=1}^{t} \pi_i\lambda_i,$

(5.18)    $\sigma_\infty^2 = \frac{1 - \alpha}{1 + \alpha}\left(\sum_{i=1}^{t} \pi_i\lambda_i^2 - V_{1,\infty}^2\right).$
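The explicit and asymptotic formulas of this section can be compared with a trial-by-trial computation. In the Python sketch below, which is not part of the original text, the first and second raw moments are advanced one trial at a time by averaging the operator result and its square over the events, each weighted by its probability of occurrence. That update rule is our own restatement of the experimenter-controlled case and should be read as an assumption; the function names and parameter values are likewise illustrative.

```python
def asymptotes(alpha, pis, lams):
    """Asymptotic mean and variance for t events with a common alpha."""
    v1_inf = sum(pi * lam for pi, lam in zip(pis, lams))
    second = sum(pi * lam ** 2 for pi, lam in zip(pis, lams))
    var_inf = (1 - alpha) / (1 + alpha) * (second - v1_inf ** 2)
    return v1_inf, var_inf


def explicit(alpha, pis, lams, v1_0, var_0, n):
    """Mean and variance on trial n from the explicit formulas."""
    v1_inf, var_inf = asymptotes(alpha, pis, lams)
    mean = v1_inf - (v1_inf - v1_0) * alpha ** n
    var = var_inf - (var_inf - var_0) * alpha ** (2 * n)
    return mean, var


def iterate(alpha, pis, lams, v1_0, var_0, n):
    """Advance the raw moments one trial at a time (assumed update rule)."""
    v1, v2 = v1_0, var_0 + v1_0 ** 2
    for _ in range(n):
        new_v1 = sum(pi * (alpha * v1 + (1 - alpha) * lam)
                     for pi, lam in zip(pis, lams))
        new_v2 = sum(pi * (alpha ** 2 * v2
                           + 2 * alpha * (1 - alpha) * lam * v1
                           + (1 - alpha) ** 2 * lam ** 2)
                     for pi, lam in zip(pis, lams))
        v1, v2 = new_v1, new_v2
    return v1, v2 - v1 ** 2


if __name__ == "__main__":
    args = (0.8, (0.7, 0.3), (1.0, 0.0), 0.2, 0.0, 7)   # hypothetical values
    print(iterate(*args), explicit(*args))
```

With limit points of unity and zero, the asymptotic mean returned by `asymptotes` equals the probability of the first event, as equation 5.15 states.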


When we have more than two response classes (and $t$ events) we can
compute a vector of marginal means, as shown in Section 4.4. These
vectors $\mathbf{V}_{1,n}$ are given by

(5.19)    $\mathbf{V}_{1,n} = \mathbf{V}_{1,\infty} - (\mathbf{V}_{1,\infty} - \mathbf{V}_{1,0})\alpha^n,$

(5.20)    $\mathbf{V}_{1,\infty} = \sum_{i=1}^{t} \pi_i\boldsymbol{\lambda}_i,$

where $\boldsymbol{\lambda}_i$ is the limit vector corresponding to event $E_i$. In Chapter 13 we
use these results for analyzing data obtained by Neimark in a three-choice
situation.


5.4 THE DISTRIBUTION MEANS FOR TWO
SUBJECT-CONTROLLED EVENTS
When the events are subject controlled, that is, when $Q_1$ is applied to $p_n$
with probability $p_n$ and $Q_2$ is applied with probability $1 - p_n$, explicit
formulas for the mean and variance were not obtained in Chapter 4.
However, when $\alpha_1 = \alpha_2 = \alpha$, major simplifications obtain, and so we now
derive the desired explicit formulas.

The recurrence formula for the means is given by equations 4.52 and
4.53:

(5.21)    $V_{1,n+1} = a_2 + (a_1 - a_2 + \alpha_2)V_{1,n} + (\alpha_1 - \alpha_2)V_{2,n}.$

This equation cannot be solved to obtain an explicit formula for $V_{1,n}$ in
terms of $V_{1,0}$, $a_1$, $a_2$, $\alpha_1$, and $\alpha_2$ because of the term in $V_{2,n}$. However,
when we let $\alpha_1 = \alpha_2 = \alpha$, this term disappears, and we have the

RECURRENCE FORMULA FOR THE MEANS:

(5.22)    $V_{1,n+1} = a_2 + (a_1 - a_2 + \alpha)V_{1,n}.$

This is a linear difference equation of the type we first encountered in
Section 3.3, and it can be solved immediately. However, it is instructive
to note that the right side of this equation is given by a new operator $\bar{Q}$
applied to $V_{1,n}$, that is,

(5.23)    $\bar{Q}V_{1,n} = a_2 + (a_1 - a_2 + \alpha)V_{1,n}.$


This operator $\bar{Q}$ has the same form as the operators $Q_i$ when they are
written in the slope-intercept form

(5.24)    $Q_i p = a_i + \alpha p.$

The operand is $V_{1,n}$ instead of $p$, the intercept is $a_2$ instead of $a_i$, and the
slope is $(a_1 - a_2 + \alpha)$ instead of $\alpha$; but mathematically $\bar{Q}$ has the same
form as $Q_1$ and $Q_2$.

The operator $\bar{Q}$ may be obtained in another way as already suggested
by the analysis in Section 3.12. Suppose we were to compute a weighted
average of the operations $Q_1V_{1,n}$ and $Q_2V_{1,n}$ with weights $V_{1,n}$ and
$1 - V_{1,n}$, respectively. We may easily verify that this will yield the
expression on the right side of equation 5.23:

(5.25)    $V_{1,n}Q_1V_{1,n} + (1 - V_{1,n})Q_2V_{1,n}$
          $= V_{1,n}(a_1 + \alpha V_{1,n}) + (1 - V_{1,n})(a_2 + \alpha V_{1,n})$
          $= a_2 + (a_1 - a_2 + \alpha)V_{1,n},$

and so we have

(5.26)    $\bar{Q}V_{1,n} = V_{1,n}Q_1V_{1,n} + (1 - V_{1,n})Q_2V_{1,n}.$

This means that $\bar{Q}$ can be considered a weighted mean or expectation of
the operators $Q_1$ and $Q_2$ and so we call $\bar{Q}$ the expected operator.
The mean $V_{1,n}$ can be obtained by applying the expected operator $\bar{Q}$ to
the initial mean $n$ times:

(5.27)    $V_{1,n} = \bar{Q}^n V_{1,0}.$

The results of Section 3.3 may be used directly if we first write $\bar{Q}$ in its
fixed point form. We will call the slope $\beta$, that is,

(5.28)    $\beta = a_1 - a_2 + \alpha,$

and we will denote the fixed point by $V_{1,\infty}$. Therefore the fixed point
form of $\bar{Q}$ is

(5.29)    $\bar{Q}V_{1,n} = \beta V_{1,n} + (1 - \beta)V_{1,\infty}.$

Comparison of this result with the slope-intercept form of $\bar{Q}$, equation
5.23, shows that

          $(1 - \beta)V_{1,\infty} = a_2.$

Thus we have, for $\beta \neq 1$, the

ASYMPTOTIC FORMULA FOR THE MEAN:

(5.30)    $V_{1,\infty} = \frac{a_2}{1 - \beta} = \frac{a_2}{1 - a_1 + a_2 - \alpha}.$


Moreover, from equation 3.9 we see that $\bar{Q}$ may be applied $n$ times to
$V_{1,0}$ to give the

EXPLICIT FORMULA FOR THE MEANS:

(5.31)    $V_{1,n} = \bar{Q}^n V_{1,0} = (1 - \beta^n)V_{1,\infty} + \beta^n V_{1,0}.$

This is the desired equation for the distribution means.
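The claim that the means follow the expected operator can be tested against the full distribution. The Python sketch below, which is not part of the original text, enumerates every response sequence for a small number of subject-controlled trials, carries the probability of each resulting $p$ value, and compares the exact mean with explicit formula 5.31. The function names are illustrative assumptions; the parameter values are those quoted in the caption of Fig. 5.1.

```python
def exact_moments(a1, a2, alpha, p0, n):
    """Exact mean and second raw moment of the p values after n trials.

    On each trial Q_1 p = a1 + alpha*p is applied with probability p and
    Q_2 p = a2 + alpha*p with probability 1 - p.
    """
    dist = {p0: 1.0}                        # p value -> probability mass
    for _ in range(n):
        nxt = {}
        for p, weight in dist.items():
            for new_p, prob in ((a1 + alpha * p, p), (a2 + alpha * p, 1 - p)):
                nxt[new_p] = nxt.get(new_p, 0.0) + weight * prob
        dist = nxt
    mean = sum(p * w for p, w in dist.items())
    second = sum(p * p * w for p, w in dist.items())
    return mean, second


def explicit_mean(a1, a2, alpha, v0, n):
    """Explicit formula 5.31; requires beta != 1."""
    beta = a1 - a2 + alpha                  # slope of the expected operator
    v_inf = a2 / (1 - beta)                 # fixed point of the expected operator
    return (1 - beta ** n) * v_inf + beta ** n * v0


if __name__ == "__main__":
    params = (0.3, 0.1, 0.6, 0.1)
    for n in (0, 2, 5, 10):
        print(n, exact_moments(*params, n)[0], explicit_mean(*params, n))
```

The enumeration doubles the number of $p$ values on every trial, so it is practical only for short sequences such as the ten trials shown in Fig. 5.1.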

Before we develop the corresponding explicit formula for the second 
raw moments we wish to point out a rather interesting property of the 
means, namely, that they behave as if they were the means associated with 
a simple two-state Markov chain with constant conditional probabilities. 


Suppose we introduce the parameter $b_1$ of the gain-loss form of $Q_1$. It is
defined by equation 1.17, that is,

(5.32)    $b_1 = 1 - a_1 - \alpha.$

Substituting this in recurrence formula 5.22, we have

(5.33)    $V_{1,n+1} = a_2 + (1 - a_2 - b_1)V_{1,n}.$

We note that the parameter $\alpha$ has been eliminated from this equation and
only the parameters $a_2$ and $b_1$ remain; that is to say, the means behave
the same way no matter what value $\alpha$ has. It is instructive to consider
$\tilde{Q}_1$, which is applied to $q = 1 - p$. We have

(5.34)    $\tilde{Q}_1 q = 1 - Q_1 p = 1 - a_1 - \alpha(1 - q),$

or, using 5.32, we have

(5.35)    $\tilde{Q}_1 q = b_1 + \alpha q.$

The operator $Q_2$ may be written

(5.36)    $Q_2 p = a_2 + \alpha p.$

We then note that $\tilde{Q}_1 q$ is the conditional probability $\Pr\{A_2 | A_1\}$, and $Q_2 p$
is the conditional probability $\Pr\{A_1 | A_2\}$. As pointed out in Section
3.12, if we take $\alpha = 0$ these two conditional probabilities are constant:

(5.37)    $\Pr\{A_2 | A_1\} = b_1 \qquad (\alpha = 0),$
          $\Pr\{A_1 | A_2\} = a_2 \qquad (\alpha = 0),$

and we have a simple two-state Markov chain. The analogue of equation
3.61 is then

(5.38)    $V_{1,n+1} = a_2 + (1 - a_2 - b_1)V_{1,n},$

but this is precisely the result we had in equation 5.33 without the restriction
that $\alpha = 0$. This is not surprising because we saw in equation 5.33 that
the means $V_{1,n}$ do not depend on $\alpha$; the means are the same for $\alpha = 0$ as
for any other value of $\alpha$. Thus we conclude that the distribution means
for the equal alpha case behave as if we had $\alpha = 0$ and hence as if we had
the constant conditional probabilities of equations 5.37. In Fig. 5.1 we
illustrate how the means vary with trial number for $b_1 = a_2 = 0.1$.


Fig. 5.1. Showing the curve of mean probability, $V_{1,n}$ vs. $n$ for ten trials,
and the standard deviation, $\sigma_n$. Equations 5.31 and 5.41 were used with $a_1 = 0.3$,
$a_2 = 0.1$, $\alpha = 0.6$, $V_{1,0} = 0.1$, $\sigma_0 = 0$.

Although the value of $\alpha$ is irrelevant for computing the means, in the next
section we show that the variances of the distributions do depend on $\alpha$.


5.5 THE DISTRIBUTION VARIANCES FOR TWO 
SUBJECT-CONTROLLED EVENTS 
It is often useful to compute moments of the distribution higher than
the first, and this can be done when $\alpha_1 = \alpha_2 = \alpha$. Equation 4.46 for the
$m$th raw moment on trial $n + 1$ becomes

(5.39)    $V_{m,n+1} = \sum_{u=0}^{m} \binom{m}{u} \alpha^u [a_1^{m-u} V_{u+1,n} + a_2^{m-u}(V_{u,n} - V_{u+1,n})].$

In this sum we may separate out the terms for $u = m$ and obtain

(5.40)    $V_{m,n+1} = \sum_{u=0}^{m-1} \binom{m}{u} \alpha^u [a_1^{m-u} V_{u+1,n} + a_2^{m-u}(V_{u,n} - V_{u+1,n})] + \alpha^m V_{m,n}.$


We now note that the highest moment on the right side is $V_{m,n}$, and so we
express $V_{m,n+1}$ in terms of $V_{0,n} = 1, V_{1,n}, \cdots, V_{m,n}$. These equations
may be solved explicitly for $m = 1, 2, \cdots$ to yield expressions for the raw
moments in terms of the trial number $n$ and the parameters.

We solved for the means $V_{1,n}$ in the preceding section so we now proceed
to solve for the second raw moments. For $m = 2$ we have, from equation
5.40, the

RECURRENCE FORMULA FOR THE SECOND MOMENTS:

(5.41)    $V_{2,n+1} = a_1^2 V_{1,n} + a_2^2(1 - V_{1,n}) + 2\alpha[a_1 V_{2,n} + a_2(V_{1,n} - V_{2,n})] + \alpha^2 V_{2,n}$
          $= a_2^2 + (a_1^2 - a_2^2 + 2a_2\alpha)V_{1,n} + \alpha(2a_1 - 2a_2 + \alpha)V_{2,n}.$

We then introduce the abbreviations

(5.42)    $B_1 = a_1^2 - a_2^2 + 2a_2\alpha,$
          $B_2 = \alpha(2a_1 - 2a_2 + \alpha),$

so that we may write

(5.43)    $V_{2,n+1} = a_2^2 + B_1 V_{1,n} + B_2 V_{2,n}.$


From the preceding section we know that

(5.44)    $V_{1,n} = \bar{Q}^n V_{1,0} = V_{1,\infty} - (V_{1,\infty} - V_{1,0})\beta^n.$

When we substitute this in the above expression for $V_{2,n+1}$ we have

(5.45)    $V_{2,n+1} = (a_2^2 + B_1 V_{1,\infty}) - B_1(V_{1,\infty} - V_{1,0})\beta^n + B_2 V_{2,n}.$

We solved a difference equation of precisely this form in Section 4.3
(cf. equations 4.18 and 4.22), and so we can write down the solution
immediately. We then get the

EXPLICIT FORMULA FOR THE SECOND MOMENTS:

(5.46)    $V_{2,n} = (a_2^2 + B_1 V_{1,\infty})\frac{1 - B_2^n}{1 - B_2} - B_1(V_{1,\infty} - V_{1,0})\frac{\beta^n - B_2^n}{\beta - B_2} + B_2^n V_{2,0},$

where $V_{1,0}$ and $V_{2,0}$ are the initial mean and second raw moment, respectively.
When $-1 < B_2 < 1$, $-1 < \beta < 1$, and $\beta \neq B_2$, we have in the
limit as $n \to \infty$ the

ASYMPTOTIC FORMULA FOR THE SECOND MOMENT:

(5.47)    $V_{2,\infty} = \frac{a_2^2 + B_1 V_{1,\infty}}{1 - B_2},$


where $V_{1,\infty}$ is given by equation 5.30. Since $B_1$ and $B_2$ are simple algebraic
functions of $a_1$, $a_2$, and $\alpha$, we have expressed $V_{2,\infty}$ in terms of those
parameters. The asymptotic variance is obtained from the relation
$\sigma_\infty^2 = V_{2,\infty} - V_{1,\infty}^2$, and so we have from equation 5.47

(5.48)    $\sigma_\infty^2 = \frac{a_2^2 + B_1 V_{1,\infty}}{1 - B_2} - V_{1,\infty}^2.$

In Fig. 5.1 we illustrate how $\sigma_n$ varies with $n$.
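The second-moment formulas of this section can be checked against one another numerically. The Python sketch below, which is not part of the original text, iterates recurrence 5.43 together with the recurrence for the means and compares the result with explicit formula 5.46; it also evaluates the asymptotic variance of equation 5.48. The function names are illustrative assumptions, and the parameter values are those quoted in the caption of Fig. 5.1.

```python
def coefficients(a1, a2, alpha):
    """Slope beta of the expected operator and abbreviations B_1, B_2 of 5.42."""
    beta = a1 - a2 + alpha
    b1 = a1 ** 2 - a2 ** 2 + 2 * a2 * alpha
    b2 = alpha * (2 * a1 - 2 * a2 + alpha)
    return beta, b1, b2


def second_moment_iterated(a1, a2, alpha, v1_0, v2_0, n):
    """Iterate recurrences 5.22 and 5.43 for n trials."""
    beta, b1, b2 = coefficients(a1, a2, alpha)
    v1, v2 = v1_0, v2_0
    for _ in range(n):
        # The right side uses the moments of the current trial for both updates.
        v1, v2 = a2 + beta * v1, a2 ** 2 + b1 * v1 + b2 * v2
    return v2


def second_moment_explicit(a1, a2, alpha, v1_0, v2_0, n):
    """Explicit formula 5.46; requires beta != 1, B_2 != 1, beta != B_2."""
    beta, b1, b2 = coefficients(a1, a2, alpha)
    v1_inf = a2 / (1 - beta)
    return ((a2 ** 2 + b1 * v1_inf) * (1 - b2 ** n) / (1 - b2)
            - b1 * (v1_inf - v1_0) * (beta ** n - b2 ** n) / (beta - b2)
            + b2 ** n * v2_0)


def asymptotic_variance(a1, a2, alpha):
    """Equation 5.48."""
    beta, b1, b2 = coefficients(a1, a2, alpha)
    v1_inf = a2 / (1 - beta)
    return (a2 ** 2 + b1 * v1_inf) / (1 - b2) - v1_inf ** 2


if __name__ == "__main__":
    args = (0.3, 0.1, 0.6, 0.1, 0.01)       # V_{2,0} = V_{1,0}**2, zero variance
    print(second_moment_iterated(*args, 10), second_moment_explicit(*args, 10))
    print(asymptotic_variance(0.3, 0.1, 0.6))
```

For these parameter values the asymptotic variance works out to 0.025, that is, an asymptotic standard deviation of about 0.16.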


5.6 SUBJECT-CONTROLLED EVENTS WITH
$\lambda_1 = 1$ AND $\lambda_2 = 0$

In many experimental applications we would be willing to assume that
the repeated performance of a response would increase the likelihood of
that response occurring until it was almost certain to occur. In mathematical
language, this means that the limit point $\lambda_1$ is unity if the response
considered is identified with alternative $A_1$. In other words, a repeated
application of $Q_1$ to $p$ would tend to make the probability of $A_1$ unity.
We might make the same assumption about response $A_2$; if $A_2$ occurred
repeatedly its probability $q = 1 - p$ would tend to unity and so $p$ would
tend to zero, implying that $\lambda_2 = 0$. In Chapter 7 we consider some of the
mathematical properties of the case of $\lambda_1 = 1$ and $\lambda_2 = 0$, but in this section
we consider these conditions when we also have $\alpha_1 = \alpha_2 = \alpha$. The two
operators are then

(5.49)    $Q_1 p = \alpha p + (1 - \alpha),$
          $Q_2 p = \alpha p.$

We see that only the single parameter $\alpha$ is involved in these operators and
so the equations should be especially simple.

A possible application of the operators of equations 5.49 is the Brunswik
T-maze with 100 percent reinforcement on both sides of the maze. (In
the Introduction and in Section 3.9 we have discussed the Brunswik
experiment.) If the rat is rewarded on the right side, $p$ (the probability of
turning right) should increase towards unity. If the rat is rewarded
for going left, this should decrease $p$ towards zero. Moreover, since the
situation is symmetric we would expect to have $\alpha_1 = \alpha_2$. Thus the
foregoing operators might be quite adequate for describing the data.

From the relation $a_i = (1 - \alpha)\lambda_i$ we see that $\lambda_1 = 1$ implies that
$a_1 = 1 - \alpha$ and that $\lambda_2 = 0$ implies that $a_2 = 0$. Hence the recurrence
formula for the means, equation 5.22, becomes

(5.50)    $V_{1,n+1} = V_{1,n},$

and so we conclude that

(5.51)    $V_{1,n} = V_{1,0} \qquad (n = 0, 1, 2, \cdots).$


In other words, the mean is constant from trial to trial and is equal to
the initial mean. This result can be obtained also from the explicit
formula 5.31, since $a_1 = 1 - \alpha$ and $a_2 = 0$ imply that $\beta = 1$.

The formula for the second raw moment can be obtained directly from
equation 5.46. We have $a_2 = 0$ and $\beta = 1$, and from definitions 5.42 we
get $B_1 = (1 - \alpha)^2$ and $B_2 = \alpha(2 - \alpha)$. Thus

(5.52)    $V_{2,n} = (1 - B_2^n)V_{1,0} + B_2^n V_{2,0}.$

For $0 < \alpha < 1$ we see that $B_2 = \alpha(2 - \alpha)$ is in the range $0 < B_2 < 1$ and
so with this condition $B_2^n$ tends to zero as $n \to \infty$. Therefore

(5.53)    $V_{2,\infty} = V_{1,0} \qquad (0 < \alpha < 1).$

The variance is of course

(5.54)    $\sigma_n^2 = V_{2,n} - V_{1,n}^2,$

and so we have

(5.55)    $\sigma_\infty^2 = V_{1,0} - V_{1,0}^2$
          $= V_{1,0}(1 - V_{1,0}).$

The fact that the asymptotic variance has this form suggests that the
asymptotic distribution is binomial with all its density at zero and unity.
We see in Chapter 7 that this is indeed true whenever $\lambda_1 = 1$ and $\lambda_2 = 0$.
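The constancy of the mean and the limiting form of the variance can be illustrated by simulation. The Python sketch below, which is not part of the original text, follows a group of hypothetical subjects who all start with the same response probability, applies the first operator of equations 5.49 whenever the first response occurs and the second otherwise, and then summarizes the final $p$ values. The function name, the number of subjects and trials, and the parameter values are illustrative assumptions.

```python
import random


def simulate_final_p(alpha, p0, trials, subjects, seed=0):
    """Final p values of simulated subjects under the operators of 5.49."""
    rng = random.Random(seed)
    finals = []
    for _ in range(subjects):
        p = p0
        for _ in range(trials):
            if rng.random() < p:            # response A_1 occurs: apply Q_1
                p = alpha * p + (1 - alpha)
            else:                           # response A_2 occurs: apply Q_2
                p = alpha * p
        finals.append(p)
    return finals


if __name__ == "__main__":
    finals = simulate_final_p(alpha=0.8, p0=0.3, trials=200, subjects=2000)
    mean = sum(finals) / len(finals)
    variance = sum((p - mean) ** 2 for p in finals) / len(finals)
    print(mean, variance)    # compare with V_{1,0} and V_{1,0}(1 - V_{1,0})
```

After many trials nearly every simulated $p$ value lies very close to zero or unity, so the sample mean should be near the initial mean and the sample variance near the value given by equation 5.55, both to within sampling error.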


5.7 SUBJECT-CONTROLLED EVENTS WITH
$\lambda_1 = 0$ AND $\lambda_2 = 1$

The preceding discussion of the conditions $\lambda_1 = 1$ and $\lambda_2 = 0$ suggests
the opposite case of $\lambda_1 = 0$ and $\lambda_2 = 1$. Whereas the previous case
might correspond to equal rewards of two similar responses, the present
case might correspond to extinction of two similar responses. If a rat
in a Brunswik T-maze were to find no food on the right side, $p$ might tend
to zero; finding no food on the left side might tend to make $p$ go to unity.
This situation would require that $\lambda_1 = 0$ and $\lambda_2 = 1$. If in addition
$\alpha_1 = \alpha_2 = \alpha$, we have the operators

(5.56)    $Q_1 p = \alpha p,$
          $Q_2 p = \alpha p + (1 - \alpha).$

We now develop expressions for the asymptotic mean and variance for
this pair of operators.

From the relation $a_i = (1 - \alpha)\lambda_i$ we see that $\lambda_1 = 0$ and $\lambda_2 = 1$ imply
that $a_1 = 0$ and $a_2 = 1 - \alpha$. From definition 5.28 it follows that

(5.57)    $\beta = 2\alpha - 1.$


The explicit formula for the means is equation 5.31 above, with the
preceding value of $\beta$, but from equation 5.30 we see that the asymptotic
mean is

(5.58)    $V_{1,\infty} = \frac{a_2}{1 - \beta} = \frac{1 - \alpha}{2(1 - \alpha)} = \frac{1}{2}.$

The asymptotic mean is 1/2 no matter what value of $\alpha$ or $V_{1,0}$ is involved.
We next study other properties of the asymptotic distribution by looking
at higher moments.

The asymptotic second raw moment is obtained directly from equation
5.47. As before, $\lambda_2 = 1$ implies that $a_2 = 1 - \alpha$, and from definitions
5.42 we get, since $a_1 = 0$,

(5.59)    $B_1 = -(1 - \alpha)^2 + 2\alpha(1 - \alpha) = (1 - \alpha)(3\alpha - 1),$
          $B_2 = \alpha[-2(1 - \alpha) + \alpha] = \alpha(3\alpha - 2).$

With these equations we get from equations 5.47 and 5.30,

(5.60)    $V_{2,\infty} = \frac{1}{1 - \alpha(3\alpha - 2)}\left\{(1 - \alpha)^2 + \frac{(1 - \alpha)(3\alpha - 1)}{2}\right\}.$

Algebraic simplifications lead to the result

(5.61)    $V_{2,\infty} = \frac{1 + \alpha}{2(1 + 3\alpha)}.$

The asymptotic variance is then

(5.62)    $\sigma_\infty^2 = V_{2,\infty} - V_{1,\infty}^2$
          $= \frac{1 + \alpha}{2(1 + 3\alpha)} - \frac{1}{4}.$

Minor simplifications then yield

(5.63)    $\sigma_\infty^2 = \frac{1 - \alpha}{4(1 + 3\alpha)}.$

For example, if $\alpha = 0$ we get the largest possible variance $\sigma_\infty^2 = 0.25$;
if $\alpha = 0.5$ we get $\sigma_\infty^2 = 0.05$.

It may be instructive to consider one more moment of the asymptotic
distribution, the third moment which has to do with skewness. It can be
shown that the third moment about the mean is zero, which suggests
that the asymptotic distribution is symmetric about the mean $V_{1,\infty} = 0.5$.
The third raw moment of the asymptotic distribution turns out to be

(5.64)    $V_{3,\infty} = \frac{1}{2(1 + 3\alpha)}.$




Further analysis shows that all the odd moments about the mean are zero, 
and so the asymptotic distribution is in fact symmetric. In Chapter 6 we 


illustrate this symmetry by displaying an approximation to the entire 
asymptotic distribution. 
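The asymptotic moments quoted in this section can be reproduced by iterating the general recurrence for the raw moments. The Python sketch below, which is not part of the original text, advances the moments up to the third with recurrence 5.40, using the intercepts that go with limit points of zero and unity, and runs long enough for the values to settle. The function names, the initial $p$ value, and the number of trials are illustrative assumptions.

```python
from math import comb


def step(moments, a1, a2, alpha):
    """Advance the raw moments V_0, ..., V_M by one trial (recurrence 5.40)."""
    order = len(moments) - 1
    new = [1.0]                                     # V_0 is always unity
    for m in range(1, order + 1):
        total = alpha ** m * moments[m]             # the separated u = m term
        for u in range(m):
            total += comb(m, u) * alpha ** u * (
                a1 ** (m - u) * moments[u + 1]
                + a2 ** (m - u) * (moments[u] - moments[u + 1]))
        new.append(total)
    return new


def asymptotic_moments(alpha, p0=0.2, order=3, trials=2000):
    """Raw moments after many trials when lambda_1 = 0 and lambda_2 = 1."""
    a1, a2 = 0.0, 1.0 - alpha
    moments = [p0 ** k for k in range(order + 1)]   # all subjects start at p0
    for _ in range(trials):
        moments = step(moments, a1, a2, alpha)
    return moments


if __name__ == "__main__":
    alpha = 0.5
    _, v1, v2, v3 = asymptotic_moments(alpha)
    print(v1, v2, v3, v2 - v1 ** 2)
```

For $\alpha = 0.5$ the printed values can be compared with equations 5.58, 5.61, 5.64, and 5.63.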


5.8 EXPERIMENTER-SUBJECT-CONTROLLED EVENTS 

We now consider the problem discussed in Sections 3.13 and 4.6 with the
equal alpha restriction. We have two alternatives $A_1$ and $A_2$, with
probabilities $p_n$ and $1 - p_n$ on trial $n$ as in the preceding sections of this chapter.
Each alternative may have outcome $O_1$ or $O_2$; the probability of $O_1$
following $A_1$ is $\pi_{11}$, and the probability of $O_1$ following $A_2$ is $\pi_{21}$. The
operator applied to $p_n$ when $A_j$ and $O_k$ occur on trial $n$ is $Q_{jk}$. From
equations 4.56 we have

(5.65)    $Q_{jk} p = \alpha_{jk} p + (1 - \alpha_{jk})\lambda_{jk}.$

Since there are four parameters $\alpha_{jk}$ we have a choice between setting all
four of these equal to one another, or imposing a less severe restriction.
We first consider the consequences of the assumption that $\alpha_{jk}$ is independent
of which response is made but does depend upon the outcome, that is,
we let

(5.66)    $\alpha_{1k} = \alpha_{2k} = \alpha_k \qquad (k = 1, 2).$

These conditions lead to the operators given by

(5.67)    $Q_{11} p = \alpha_1 p + (1 - \alpha_1)\lambda_{11} \qquad [p\pi_1],$
          $Q_{12} p = \alpha_2 p + (1 - \alpha_2)\lambda_{12} \qquad [p(1 - \pi_1)],$
          $Q_{21} p = \alpha_1 p + (1 - \alpha_1)\lambda_{21} \qquad [(1 - p)\pi_2],$
          $Q_{22} p = \alpha_2 p + (1 - \alpha_2)\lambda_{22} \qquad [(1 - p)(1 - \pi_2)].$

The probability of applying each operator is shown in square brackets
after each of the preceding equations. (For simplicity we have let
$\pi_{j1} = \pi_j$ and $\pi_{j2} = 1 - \pi_j$ for $j = 1, 2$.) The mean parameters defined
by equation 4.61 of Chapter 4 become

(5.68)    $\bar{a}_1 = \pi_1(1 - \alpha_1)\lambda_{11} + (1 - \pi_1)(1 - \alpha_2)\lambda_{12},$
          $\bar{a}_2 = \pi_2(1 - \alpha_1)\lambda_{21} + (1 - \pi_2)(1 - \alpha_2)\lambda_{22},$
          $\bar{\alpha}_1 = \pi_1\alpha_1 + (1 - \pi_1)\alpha_2,$
          $\bar{\alpha}_2 = \pi_2\alpha_1 + (1 - \pi_2)\alpha_2.$

From equation 4.62 we then get the recurrence formula

(5.69)    $V_{1,n+1} = \bar{a}_2 + (\bar{a}_1 - \bar{a}_2 + \bar{\alpha}_2)V_{1,n} + (\pi_1 - \pi_2)(\alpha_1 - \alpha_2)V_{2,n}.$


The annoying feature of equation 4.62, the presence of the term in
$V_{2,n}$, has not been eliminated, unless either $\alpha_1 = \alpha_2$ or $\pi_1 = \pi_2$.
When we take $\alpha_1 = \alpha_2 = \alpha$, we then have the operators

(5.70)    $Q_{11} p = \alpha p + (1 - \alpha)\lambda_{11} \qquad [p\pi_1],$
          $Q_{12} p = \alpha p + (1 - \alpha)\lambda_{12} \qquad [p(1 - \pi_1)],$
          $Q_{21} p = \alpha p + (1 - \alpha)\lambda_{21} \qquad [(1 - p)\pi_2],$
          $Q_{22} p = \alpha p + (1 - \alpha)\lambda_{22} \qquad [(1 - p)(1 - \pi_2)].$

The probability of application of each of these operators is shown in
square brackets on the right of each. In the last of equations 5.68 we
see that, when $\alpha_1 = \alpha_2 = \alpha$,

(5.71)    $\bar{\alpha}_2 = \alpha,$

and so equation 5.69 becomes

(5.72)    $V_{1,n+1} = \bar{a}_2 + (\bar{a}_1 - \bar{a}_2 + \alpha)V_{1,n}.$

This result generalizes immediately to $s$ possible outcomes $O_k$ with
conditional probabilities $\pi_{jk}$, provided that we re-define

(5.73)    $\bar{a}_1 = (1 - \alpha)\sum_{k=1}^{s} \pi_{1k}\lambda_{1k},$
          $\bar{a}_2 = (1 - \alpha)\sum_{k=1}^{s} \pi_{2k}\lambda_{2k},$

and recall the condition that

(5.74)    $\sum_{k=1}^{s} \pi_{jk} = 1 \qquad (j = 1, 2).$

Equation 5.72 can be solved immediately for $V_{1,n}$:

(5.75)    $V_{1,n} = V_{1,\infty} - (V_{1,\infty} - V_{1,0})\gamma^n,$

where

(5.76)    $V_{1,\infty} = \frac{\bar{a}_2}{1 - \gamma},$

and

(5.77)    $\gamma = (\bar{a}_1 - \bar{a}_2 + \alpha).$

Higher moments of the distributions can be computed in a similar manner.


5.9 EXPERIMENTER-SUBJECT-CONTROLLED EVENTS
WITH LIMITS ZERO AND UNITY

When we identify outcome $O_1$ with reward and outcome $O_2$ with
non-reward, we are often willing to let $\lambda_{11} = \lambda_{22} = 1$ and $\lambda_{12} = \lambda_{21} = 0$.
The operators then are given by

(5.78)    $Q_{11} p = \alpha_1 p + (1 - \alpha_1),$
          $Q_{12} p = \alpha_2 p,$
          $Q_{21} p = \alpha_1 p,$
          $Q_{22} p = \alpha_2 p + (1 - \alpha_2).$

The recurrence formula for the mean becomes

(5.79)    $V_{1,n+1} = (1 - \pi_2)(1 - \alpha_2) + [1 - 2(1 - \pi_2)(1 - \alpha_2) + (\pi_1 - \pi_2)(1 - \alpha_1)]V_{1,n}$
          $+ (\pi_1 - \pi_2)(\alpha_1 - \alpha_2)V_{2,n}.$

This formula again suggests two special cases: $\pi_1 = \pi_2 = \pi$ and
$\alpha_1 = \alpha_2 = \alpha$. For the equal $\pi$ case, we have

(5.80)    $V_{1,n+1} = (1 - \pi)(1 - \alpha_2) + [1 - 2(1 - \pi)(1 - \alpha_2)]V_{1,n}.$

This has the solution

(5.81)    $V_{1,n} = V_{1,\infty} - (V_{1,\infty} - V_{1,0})[1 - 2(1 - \pi)(1 - \alpha_2)]^n,$

where

(5.82)    $V_{1,\infty} = 1/2.$

For $\alpha_1 = \alpha_2 = \alpha$, recurrence formula 5.79 becomes

(5.83)    $V_{1,n+1} = (1 - \pi_2)(1 - \alpha) + [1 - (2 - \pi_1 - \pi_2)(1 - \alpha)]V_{1,n},$

and its solution is

(5.84)    $V_{1,n} = V_{1,\infty} - (V_{1,\infty} - V_{1,0})[1 - (2 - \pi_1 - \pi_2)(1 - \alpha)]^n,$

where

(5.85)    $V_{1,\infty} = \frac{1 - \pi_2}{2 - \pi_1 - \pi_2}.$


The parameter $\alpha$ does not appear in the expression for the asymptotic
mean, which can be computed from the reward probabilities $\pi_1$ and $\pi_2$
in advance of collecting data. For example, if we reward $A_1$ on 50
percent of the trials it occurs and reward $A_2$ on 10 percent of the trials it
occurs, this result says that, for a group of subjects, $A_1$ should occur
about 64 percent of the time after learning is complete. (It is worth
noting that this is a good way from 83 percent, a result obtained by
simple proportions.) In Chapter 13 we use this result in examining
several sets of data from two-choice experiments.
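
The arithmetic of this example can be sketched in Python as follows. The reward probabilities 0.5 and 0.1 are those of the example; the function names, the learning parameter 0.9, and the starting value 0.5 are assumptions made only for illustration.

```python
def asymptotic_mean(pi1, pi2):
    """Equation 5.85: V_{1,inf} = (1 - pi2) / (2 - pi1 - pi2)."""
    return (1.0 - pi2) / (2.0 - pi1 - pi2)


def mean_after(pi1, pi2, alpha, v0, n):
    """Iterate recurrence 5.83 n times, starting from V_{1,0} = v0."""
    v = v0
    for _ in range(n):
        v = (1 - pi2) * (1 - alpha) + (1 - (2 - pi1 - pi2) * (1 - alpha)) * v
    return v


pi1, pi2 = 0.5, 0.1                            # reward probabilities from the example
print(round(asymptotic_mean(pi1, pi2), 3))     # model prediction, about 0.64
print(round(pi1 / (pi1 + pi2), 3))             # "simple proportions", about 0.83
# alpha = 0.9 and v0 = 0.5 are assumed values; the limit does not depend on them.
print(round(mean_after(pi1, pi2, 0.9, 0.5, 500), 3))
```

Changing the assumed `alpha` alters only how quickly the iterated mean approaches the limit, not the limit itself.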


SEC. 5.10 EXTENSION TO r RESPONSES AND $ OUTCOMES 121 


*5.10 EXTENSION TO r RESPONSES AND s OUTCOMES 


In this section we show how the analysis of the preceding sections can 
be generalized to any number r of response classes and any number s of 
outcomes. We use the matrix operators of Section 1.8; when response
$A_j$ and outcome $O_k$ occur we apply the operator
$$T_{jk} = \alpha_{jk} I + (1 - \alpha_{jk})\Lambda_{jk} \tag{5.86}$$
to the vector $\mathbf{p}_n$ to obtain the vector $\mathbf{p}_{n+1}$. We impose the equal alpha
restriction
$$\alpha_{jk} = \alpha \qquad (j = 1, 2, \cdots, r;\; k = 1, 2, \cdots, s), \tag{5.87}$$
and so we have
$$T_{jk} = \alpha I + (1 - \alpha)\Lambda_{jk}. \tag{5.88}$$
The probability of applying this operator to the vector $\mathbf{p}_n$ on trial $n$ is
$p_{n,j}\,\pi_{jk}$, where $p_{n,j}$ is the $j$th component of $\mathbf{p}_n$ and $\pi_{jk}$ is the conditional
probability of outcome $O_k$, given alternative $A_j$ occurred.

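As in Section 4.4 we are involved in a multivariate distribution of p
values on each trial, but we restrict our attention to the marginal means.
We define a vector of marginal means
$$\mathbf{V}_{1,n} = \sum_{\nu=1}^{(rs)^{n}} P_{\nu n}\,\mathbf{p}_{\nu n}. \tag{5.89}$$
Following the procedure of Section 4.4, we have for the next trial the
vector
$$\mathbf{V}_{1,n+1} = \sum_{\nu=1}^{(rs)^{n}} P_{\nu n} \sum_{j=1}^{r} \sum_{k=1}^{s} p_{\nu n,j}\,\pi_{jk}\,T_{jk}\,\mathbf{p}_{\nu n}. \tag{5.90}$$
Using
$$T_{jk}\,\mathbf{p}_{\nu n} = \alpha\,\mathbf{p}_{\nu n} + (1 - \alpha)\boldsymbol{\lambda}_{jk}, \tag{5.91}$$
and the relations
$$\sum_{k=1}^{s} \pi_{jk} = 1, \qquad \sum_{j=1}^{r} p_{\nu n,j} = 1, \tag{5.92}$$
we have
$$\mathbf{V}_{1,n+1} = \sum_{\nu=1}^{(rs)^{n}} P_{\nu n} \Big\{ \alpha\,\mathbf{p}_{\nu n} + (1 - \alpha) \sum_{j=1}^{r} \sum_{k=1}^{s} p_{\nu n,j}\,\pi_{jk}\,\boldsymbol{\lambda}_{jk} \Big\}. \tag{5.93}$$
We then define an average vector $\bar{\boldsymbol{\lambda}}_j$ by
$$\bar{\boldsymbol{\lambda}}_j = \sum_{k=1}^{s} \pi_{jk}\,\boldsymbol{\lambda}_{jk}. \tag{5.94}$$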




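Now consider the sum
$$\sum_{j=1}^{r} \sum_{k=1}^{s} p_{\nu n,j}\,\pi_{jk}\,\boldsymbol{\lambda}_{jk} = \sum_{j=1}^{r} p_{\nu n,j}\,\bar{\boldsymbol{\lambda}}_j. \tag{5.95}$$
We show that this sum can be obtained by applying a new matrix $\bar{\Lambda}$ to
the vector $\mathbf{p}_{\nu n}$. Let $\bar{\Lambda}$ be formed by the $r$ vectors $\bar{\boldsymbol{\lambda}}_j$ with $\bar{\boldsymbol{\lambda}}_j$ in the $j$th
column, that is, if
$$
\bar{\boldsymbol{\lambda}}_j =
\begin{bmatrix}
\bar{\lambda}_{j,1}\\ \bar{\lambda}_{j,2}\\ \vdots\\ \bar{\lambda}_{j,r}
\end{bmatrix},
\tag{5.96}
$$
then
$$
\bar{\Lambda} =
\begin{bmatrix}
\bar{\lambda}_{1,1} & \bar{\lambda}_{2,1} & \cdots & \bar{\lambda}_{r,1}\\
\bar{\lambda}_{1,2} & \bar{\lambda}_{2,2} & \cdots & \bar{\lambda}_{r,2}\\
\vdots & \vdots & & \vdots\\
\bar{\lambda}_{1,r} & \bar{\lambda}_{2,r} & \cdots & \bar{\lambda}_{r,r}
\end{bmatrix}.
\tag{5.97}
$$
When we apply this matrix to $\mathbf{p}_{\nu n}$ we have for the $u$th element of $\bar{\Lambda}\,\mathbf{p}_{\nu n}$
the sum
$$\sum_{j=1}^{r} \bar{\lambda}_{j,u}\,p_{\nu n,j}. \tag{5.98}$$
Hence we have
$$\bar{\Lambda}\,\mathbf{p}_{\nu n} = \sum_{j=1}^{r} p_{\nu n,j}\,\bar{\boldsymbol{\lambda}}_j. \tag{5.99}$$
Using this result in equation 5.93 gives
$$\mathbf{V}_{1,n+1} = \sum_{\nu=1}^{(rs)^{n}} P_{\nu n} \big\{ \alpha\,\mathbf{p}_{\nu n} + (1 - \alpha)\,\bar{\Lambda}\,\mathbf{p}_{\nu n} \big\}. \tag{5.100}$$
Using the definition of $\mathbf{V}_{1,n}$ we then get
$$\mathbf{V}_{1,n+1} = \alpha\,\mathbf{V}_{1,n} + (1 - \alpha)\,\bar{\Lambda}\,\mathbf{V}_{1,n}. \tag{5.101}$$
Therefore we define an expected operator $\bar{T}$ by
$$\bar{T} = \alpha I + (1 - \alpha)\bar{\Lambda}, \tag{5.102}$$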




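so that
$$\mathbf{V}_{1,n+1} = \bar{T}\,\mathbf{V}_{1,n}. \tag{5.103}$$

The fixed point of the operator $\bar{T}$ will be the vector of asymptotic
marginal means $\mathbf{V}_{1,\infty}$. We solve for this by setting
$$\mathbf{V}_{1,\infty} = \bar{T}\,\mathbf{V}_{1,\infty} = \alpha\,\mathbf{V}_{1,\infty} + (1 - \alpha)\,\bar{\Lambda}\,\mathbf{V}_{1,\infty}. \tag{5.104}$$
Hence
$$\bar{\Lambda}\,\mathbf{V}_{1,\infty} = \mathbf{V}_{1,\infty}. \tag{5.105}$$
This vector equation can be solved to yield the $j$th component $V_{1,\infty,j}$
of the vector $\mathbf{V}_{1,\infty}$. We have the set of equations
$$\sum_{j=1}^{r} \bar{\lambda}_{j,u}\,V_{1,\infty,j} = V_{1,\infty,u} \qquad (u = 1, 2, \cdots, r), \tag{5.106}$$
and the necessary condition
$$\sum_{u=1}^{r} V_{1,\infty,u} = 1. \tag{5.107}$$
These equations can be solved for the asymptotic marginal means.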

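We illustrate the solution of the preceding equations for the simpler
case of two outcomes $O_1$ and $O_2$. The example to be given would be
appropriate if endlessly repeated reinforcement of a response made it
certain to occur, whereas endlessly repeated extinction of a response
(were this possible) would eliminate it, but make all other responses
equally likely. Accordingly, we let
$$
\lambda_{j1,u} =
\begin{cases}
1 & \text{for } u = j,\\
0 & \text{for } u \neq j,
\end{cases}
\qquad
\lambda_{j2,u} =
\begin{cases}
0 & \text{for } u = j,\\[2pt]
\dfrac{1}{r - 1} & \text{for } u \neq j.
\end{cases}
\tag{5.108}
$$
Then, writing $\pi_j$ for $\pi_{j1}$ so that $\pi_{j2} = 1 - \pi_j$, we get
$$
\bar{\lambda}_{j,u} =
\begin{cases}
\pi_j & \text{for } u = j,\\[2pt]
\dfrac{1 - \pi_j}{r - 1} & \text{for } u \neq j.
\end{cases}
\tag{5.109}
$$
Equation 5.106 then gives
$$\pi_u\,V_{1,\infty,u} + \sum_{j \neq u} \frac{1 - \pi_j}{r - 1}\,V_{1,\infty,j} = V_{1,\infty,u}. \tag{5.110}$$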


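Re-arrangements then give
$$(1 - \pi_u)\,V_{1,\infty,u} = \frac{1}{r} \sum_{j=1}^{r} (1 - \pi_j)\,V_{1,\infty,j}. \tag{5.111}$$
The right side of this equation is a constant which we may call $C$ and so
$$V_{1,\infty,u} = \frac{C}{1 - \pi_u}. \tag{5.112}$$
When we impose condition 5.107 we then can evaluate $C$ and get finally
$$V_{1,\infty,u} = \frac{\dfrac{1}{1 - \pi_u}}{\displaystyle\sum_{j=1}^{r} \dfrac{1}{1 - \pi_j}}. \tag{5.113}$$
Hence we obtain the asymptotic marginal means in terms of the conditional
probabilities $\pi_j$.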
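
A short Python sketch can confirm that the vector given by equation 5.113 is a fixed point of the matrix built from equation 5.109. The function names and the three reward probabilities are hypothetical and serve only as an illustration.

```python
def asymptotic_marginal_means(pi):
    """Equation 5.113: V_u is proportional to 1 / (1 - pi_u)."""
    weights = [1.0 / (1.0 - p) for p in pi]
    total = sum(weights)
    return [w / total for w in weights]


def lambda_bar(pi):
    """Matrix of equation 5.109; entry [u][j] is the u-th element of lambda-bar_j."""
    r = len(pi)
    return [[pi[j] if u == j else (1.0 - pi[j]) / (r - 1) for j in range(r)]
            for u in range(r)]


def apply_matrix(matrix, vector):
    """Plain matrix-vector product, kept free of external dependencies."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


pi = [0.5, 0.1, 0.3]                      # assumed reward probabilities
v = asymptotic_marginal_means(pi)
print(v)
print(apply_matrix(lambda_bar(pi), v))    # should reproduce v (equation 5.105)
```

With only two responses the same function reduces to equation 5.85, which gives a quick consistency check between Sections 5.9 and 5.10.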


*5.11 MARKOV SEQUENCES OF EXPERIMENTER- 
CONTROLLED EVENTS 


In Section 3.11 it was pointed out that a Markov chain had been used 
for constructing a schedule of events in at least one experimental study. 
Hake and Hyman [1] had subjects predict on each of 240 trials which of 
two symbols would appear. The sequences of the symbols were prepared 
in advance and so were independent of the responses made by the subjects. 
In two of the four groups of subjects, the sequences of symbols were 
Markov chains. 

Formulas for the moments of the distributions of response probabilities 
were not developed in Chapter 4 for a Markov sequence of operators. 
The analysis is quite complicated and the results unsightly except when 
the equal-alpha restriction is imposed. However, this restriction appears 
appropriate when we are describing experiments such as the one of Hake 
and Hyman: the events—the appearance of one symbol or the other— 
have an intrinsic symmetry. If the roles of the two symbols were reversed, 
the data should be the same except for relabeling of the two responses— the 
prediction of one symbol or the other. Therefore, in this section, we
compute the trial by trial means of the distributions of response prob- 
abilities for this problem. 

The two event operators are
$$
\begin{aligned}
Q_1\,p &= \alpha p + (1 - \alpha)\lambda_1,\\
Q_2\,p &= \alpha p + (1 - \alpha)\lambda_2.
\end{aligned}
\tag{5.114}
$$


SEC. 5.11 MARKOV SEQUENCES OF EVENTS 125 


These operators are applied according to the rules of a Markov chain.
If $Q_1$ has just been applied, it will be applied again with probability $\pi(1|1)$,
whereas if $Q_2$ has just been applied, $Q_1$ will be applied with probability
$\pi(1|2)$. The two conditional probabilities of applying $Q_2$ are then
$$
\begin{aligned}
\pi(2|1) &= 1 - \pi(1|1),\\
\pi(2|2) &= 1 - \pi(1|2).
\end{aligned}
\tag{5.115}
$$
Hence the Markov chain is specified by these conditional probabilities
and an initial probability $\pi_0(1)$ of applying $Q_1$ on trial 0. The probability
of applying $Q_2$ on trial 0 is
$$\pi_0(2) = 1 - \pi_0(1). \tag{5.116}$$
With these rules of application of the operators we wish to compute the
means $V_{1,n}$ of the resulting distributions of response probabilities.
On any trial beyond the initial one we have several possible sequences
of events up through that trial; there are $2^{n}$ possible sequences up through
trial $n$. Half of these sequences terminate in event $E_1$ and half in event $E_2$.
The probabilities of events $E_1$ and $E_2$ on trial $n + 1$ depend upon the
event which actually occurred on trial $n$, and so we must distinguish
between the two classes of sequences. Let the sequences terminating in
$E_1$ on trial $n$ be labeled $\nu = 1, 2, \cdots, 2^{n-1}$, and those ending in an $E_2$ on
trial $n$ be labeled $\mu = 1, 2, \cdots, 2^{n-1}$. Furthermore, denote the probability
of the $\nu$th sequence ending in $E_1$ by $P_{\nu n}$, and the response probability
associated with this sequence by $p_{\nu n}$. Similarly, let the probability of the
$\mu$th sequence ending in $E_2$ be $P_{\mu n}$ and the associated response probability
be $p_{\mu n}$.

The unconditional probability of event $E_1$ on trial $n$ of a Markov chain
was computed in Section 3.11, but in the notation just introduced it is
$$\pi_n(1) = \sum_{\nu} P_{\nu n}. \tag{5.117}$$
Similarly, the unconditional probability of $E_2$ on trial $n$ is
$$\pi_n(2) = \sum_{\mu} P_{\mu n}, \tag{5.118}$$
and we have of course
$$\pi_n(1) + \pi_n(2) = 1. \tag{5.119}$$
The mean response probability on trial $n$ is given by
$$V_{1,n} = \sum_{\nu} P_{\nu n}\,p_{\nu n} + \sum_{\mu} P_{\mu n}\,p_{\mu n}. \tag{5.120}$$




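On trial $n + 1$ the mean will be
$$
\begin{aligned}
V_{1,n+1} ={}& \sum_{\nu} P_{\nu n}\,[\,\pi(1|1)\,Q_1 p_{\nu n} + \pi(2|1)\,Q_2 p_{\nu n}\,]\\
 &+ \sum_{\mu} P_{\mu n}\,[\,\pi(1|2)\,Q_1 p_{\mu n} + \pi(2|2)\,Q_2 p_{\mu n}\,].
\end{aligned}
\tag{5.121}
$$
We then insert the expressions for $Q_1 p$ and $Q_2 p$ given above:
$$
\begin{aligned}
V_{1,n+1} ={}& \sum_{\nu} P_{\nu n}\,\{\pi(1|1)[\alpha p_{\nu n} + (1 - \alpha)\lambda_1] + \pi(2|1)[\alpha p_{\nu n} + (1 - \alpha)\lambda_2]\}\\
 &+ \sum_{\mu} P_{\mu n}\,\{\pi(1|2)[\alpha p_{\mu n} + (1 - \alpha)\lambda_1] + \pi(2|2)[\alpha p_{\mu n} + (1 - \alpha)\lambda_2]\}.
\end{aligned}
\tag{5.122}
$$
Using relations 5.115, we get after simplifications:
$$
\begin{aligned}
V_{1,n+1} ={}& \sum_{\nu} P_{\nu n}\,\{\alpha p_{\nu n} + (1 - \alpha)[\pi(1|1)\lambda_1 + \pi(2|1)\lambda_2]\}\\
 &+ \sum_{\mu} P_{\mu n}\,\{\alpha p_{\mu n} + (1 - \alpha)[\pi(1|2)\lambda_1 + \pi(2|2)\lambda_2]\}.
\end{aligned}
\tag{5.123}
$$
We then use the definitions 5.117, 5.118, and 5.120 and get
$$
\begin{aligned}
V_{1,n+1} = \alpha V_{1,n} + (1 - \alpha)\{&\pi_n(1)[\pi(1|1)\lambda_1 + \pi(2|1)\lambda_2]\\
 &+ \pi_n(2)[\pi(1|2)\lambda_1 + \pi(2|2)\lambda_2]\}.
\end{aligned}
\tag{5.124}
$$
This formula, which is a recurrence formula in the means, can be solved
to give an explicit formula.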
From the analysis in Section 3.11, we obtain the expressions
$$\pi_n(1) = \pi_\infty(1) - [\,\pi_\infty(1) - \pi_0(1)\,][\,1 - \pi(2|1) - \pi(1|2)\,]^{n}, \tag{5.125}$$
$$\pi_\infty(1) = \frac{\pi(1|2)}{\pi(1|2) + \pi(2|1)}. \tag{5.126}$$
These relations can be inserted in the foregoing recurrence formula for
$V_{1,n}$ and the resulting difference equation solved. We do not carry this
out here, but impose an additional restriction, namely, we take $\pi_0(1) =
\pi_\infty(1)$. This simply means that the initial probability of event $E_1$ is chosen
to match the asymptotic proportion; this was done in the Hake-Hyman
experiment, for example. When this is true, $\pi_n(1) = \pi_\infty(1)$ and equation
5.124 simplifies to
$$V_{1,n+1} = \alpha V_{1,n} + (1 - \alpha)\{\pi_\infty(1)\lambda_1 + [\,1 - \pi_\infty(1)\,]\lambda_2\}. \tag{5.127}$$
In the limit when $n \to \infty$, we have $V_{1,n+1} = V_{1,n} = V_{1,\infty}$ and
$$V_{1,\infty} = \pi_\infty(1)\lambda_1 + [\,1 - \pi_\infty(1)\,]\lambda_2. \tag{5.128}$$
Finally, if we further assume that $\lambda_1 = 1$ and $\lambda_2 = 0$, assumptions which
seem appropriate for the Hake-Hyman experiment, we get
$$V_{1,\infty} = \pi_\infty(1). \tag{5.129}$$


This result agrees with the experimental finding of Hake and Hyman: 
The asymptotic frequency of predictions of one symbol is equal to the 
frequency of occurrence of that symbol in the sequence. 
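
The recurrence 5.124 is easy to iterate numerically. The sketch below is a minimal illustration under assumed values: the transition probabilities, the learning parameter, and the starting mean are hypothetical, and the function names are not from the text. The event chain is started at its stationary probability, as in the restricted case treated above.

```python
def markov_means(alpha, lam1, lam2, p11, p12, pi0, v0, n_trials):
    """Iterate recurrence 5.124.

    p11 = pi(1|1), p12 = pi(1|2); pi0 is the probability of E_1 on trial 0.
    Returns the list [V_{1,0}, V_{1,1}, ..., V_{1,n_trials}].
    """
    p21, p22 = 1.0 - p11, 1.0 - p12              # relations 5.115
    pi_n, v = pi0, v0
    means = [v]
    for _ in range(n_trials):
        drift = (pi_n * (p11 * lam1 + p21 * lam2)
                 + (1.0 - pi_n) * (p12 * lam1 + p22 * lam2))
        v = alpha * v + (1.0 - alpha) * drift
        pi_n = pi_n * p11 + (1.0 - pi_n) * p12   # one step of the event chain
        means.append(v)
    return means


def stationary_probability(p11, p12):
    """Equation 5.126: pi_inf(1) = pi(1|2) / (pi(1|2) + pi(2|1))."""
    return p12 / (p12 + (1.0 - p11))


# Assumed values: alpha, the transition probabilities, and v0 are hypothetical.
p11, p12 = 0.8, 0.4
pi_inf = stationary_probability(p11, p12)
means = markov_means(0.9, 1.0, 0.0, p11, p12, pi_inf, 0.5, 400)
print(pi_inf, means[-1])
```

With $\lambda_1 = 1$ and $\lambda_2 = 0$ the last mean should be very close to the stationary event probability, as equation 5.129 states.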


5.12 SUMMARY


The equal alpha condition examined in this chapter is of particular
interest in analyzing data from choice experiments that involve similar 
alternative responses. In Chapter 13 several such experiments are 
discussed, and the machinery of this chapter applied. When the events 
are subject controlled or experimenter-subject controlled, the equal alpha 
condition leads to some major mathematical simplifications. The 
recurrence formulas for the moments can be solved explicitly. 

The three types of event sequences considered in the previous two
chapters (experimenter-controlled, subject-controlled, and
experimenter-subject-controlled) are re-examined in this chapter with the
equal alpha condition. Formulas for the means and variances are derived.
The analysis is extended to arbitrary numbers of response classes and
outcomes in Section 5.10. A Markov sequence of experimenter-controlled
events is considered in Section 5.11, and formulas for the means
are developed.


REFERENCE 
1. Hake, H. W., and Hyman, R. Perception of the statistical structure of a random 
series of binary symbols. J. exp. Psychol., 1953, 45, 64-74. 


CHAPTER 6 
Approximate Methods 


6.1 INTRODUCTION 


In Chapter 4 we studied the distributions of p values when event 
occurrences were uncertain. Our main interest was in the moments of 
those distributions, especially the mean and variance. When events 
were experimenter controlled, that is, when the event probabilities were 
fixed, there was no difficulty in computing the mean and variance of the 
distribution from trial to trial; but when the events were subject controlled 
or experimenter-subject controlled, difficulties arose. Only with equal 
alphas, discussed in the last chapter, were we able to compute the moments 
exactly without expending a great amount of labor. As a result, we now
turn to some approximate methods for computing trial-to-trial means for 
subject-controlled and experimenter-subject-controlled events without 
equal alphas. 

Two main types of approximate methods are developed. The first 
general method involves a random number scheme for studying how 
individual model organisms behave. This method can be applied to 
many organisms, thereby giving estimates of means and variances. 
Furthermore, it may provide some insight into the kind of sequential 
behavior which is implied by the model. In Section 6.2 we introduce 
this method, and in Section 6.3 we apply it to a study of the asymptotic 
distributions. The second general method provides upper and lower 
bounds on the distribution means. This method leads to a statement 
that a mean is between two numbers that are readily computed from the 
formulas developed. When the upper and lower bounds are close to- 
gether, we may know all that we need to know about the mean. An 
example discussed in Section 6.8 gives bounds of 0.676 and 0.655. We 
are ordinarily quite satisfied with bounds as "tight" as this, but we are 
not always so fortunate. Table 6.3 shows several sets of bounds; some 
sets are so far apart that they are of little value, whereas others 
are quite close together. 




SEC. 6.2 STAT-RATS 129 


6.2 STAT-RATS 


One method of obtaining approximate estimates of means, moments,
and percentage points of the probability value distribution uses random
numbers. If we are given the constants for the operators and a starting
probability in numerical form, we can mechanically carry a hypothetical
animal through a sequence of trials, noting which operator is to be
applied on each trial, applying it, and keeping track of the p value at each
stage. We have christened such hypothetical animals "stat-rats."
When enough stat-rats are run we can obtain very good estimates of any
constant of the p-value distribution that we wish for any trial number.

The idea of the method is old. Before gamblers became so educated
that they could compute probabilities or so wealthy they could hire
mathematicians to do it for them, the standard method of estimating
probabilities was to keep track of the outcomes of a large number of trials.
In scientific work Student (William S. Gosset) used this method plus
some intuition to obtain the sampling distribution of the correlation
coefficient for small samples when the true correlation is zero [1]. Some
time later R. A. Fisher demonstrated mathematically that Student's
answer was correct. More recently such random number techniques
have been called "Monte Carlo methods," and they are used in solving
quite advanced problems in mathematical physics.

We illustrate the method by applying it for 25 trials to two subject-
controlled events with $p_0 = 0.2$, $a_1 = 0.3$, $\alpha_1 = 0.6$, $a_2 = 0.01$, $\alpha_2 = 0.9$.
In Table 6.1 the second column gives the p values for each trial. The
third column is a set of 2-digit random numbers,* and the fourth column
gives the operator to be applied on each trial. The 2-digit random
numbers are of the forms 00, 01, 02, ..., 99. The number 00 stands for
all decimal numbers from 0.00000... to 0.009999.... We choose the
numbers at the low end of the scale to go with applying $Q_1$ and the high
numbers with $Q_2$. Thus if the probability of $Q_1$ is 0.344 on a particular
trial the random numbers 00, 01, 02, ..., 33 all call for application of
$Q_1$ and the numbers 35, 36, ..., 99 call for application of $Q_2$. The
number 34 is indeterminate because it may mean any number from
0.34000... to 0.34999.... For numbers from 0.3400... to 0.343999...
we wish to apply $Q_1$, and for numbers from 0.34400... to 0.34999... we wish to
apply $Q_2$. The ambiguity is resolved, not by throwing the number
away and getting another, but by adding additional random digits to the
end of the number until a decision is reached. This is usually easy
because it is convenient to lay out in advance the random numbers for
several stat-rats at once in parallel columns. When a tie occurs we join


* Numerous tables of random numbers are available [2, 3, 4, 5, 6]. 


130 APPROXIMATE METHODS cH. 6 


the first digit of the random number of the next stat-rat on that trial to
the end of the number we already have, leaving the number for the next
stat-rat unchanged. For example, if in the present tie the next digit is
0, 1, 2, or 3 we must have a number less than 0.344 and we apply $Q_1$;
but, if the digits 4, 5, 6, 7, 8, or 9 appear, we apply $Q_2$. It is not worth
carrying more than two digits in the random numbers in the original
layout because ambiguities seldom occur.
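
The decision rule, including the treatment of ties, can be sketched in Python as below. The function name `choose_operator` and the use of strings for the probability and the random digits are assumptions of this sketch; exact fractions are used so that boundary cases are not disturbed by floating-point rounding.

```python
from fractions import Fraction


def choose_operator(p, digits):
    """Return 1 or 2 for Q1 or Q2, or None if the digits do not yet decide.

    p is the probability of applying Q1, given as a string such as "0.344".
    digits is a string of random decimal digits such as "34" or "342".
    A k-digit number d stands for the interval [d / 10**k, (d + 1) / 10**k).
    """
    p = Fraction(p)
    scale = 10 ** len(digits)
    low = Fraction(int(digits), scale)
    high = Fraction(int(digits) + 1, scale)
    if high <= p:
        return 1          # whole interval lies below p: apply Q1
    if low >= p:
        return 2          # whole interval lies at or above p: apply Q2
    return None           # tie: append another random digit and try again


print(choose_operator("0.344", "33"))    # 1
print(choose_operator("0.344", "35"))    # 2
print(choose_operator("0.344", "34"))    # None
print(choose_operator("0.344", "343"))   # 1
print(choose_operator("0.344", "344"))   # 2
```

When the function returns `None`, the caller appends one more random digit to `digits` and calls it again, which mirrors the procedure described in the text.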

Inspection of Table 6.1 clarifies the procedure used. A p value of 
0.2000 was chosen for trial 0; the first random number chosen was 84, 


TABLE 6.1

Illustration of a computation sheet for 25 trials of one stat-rat. The operations
are $Q_1 p = 0.3 + 0.6p$ and $Q_2 p = 0.01 + 0.9p$, and $p_0 = 0.2$.

 Trial                 Random     Operator
 Number       p        Number     Applied

    0       0.2000       84        $Q_2$
    1       0.1900       29        $Q_2$
    2       0.1810       35        $Q_2$
    3       0.1729       69        $Q_2$
    4       0.1656       53        $Q_2$
    5       0.1590       37        $Q_2$
    6       0.1531       05        $Q_1$
    7       0.3919       50        $Q_2$
    8       0.3627       57        $Q_2$
    9       0.3364       60        $Q_2$
   10       0.3128       55        $Q_2$
   11       0.2915       58        $Q_2$
   12       0.2724       79        $Q_2$
   13       0.2552       50        $Q_2$
   14       0.2397       56        $Q_2$
   15       0.2257       01        $Q_1$
   16       0.4354       51        $Q_2$
   17       0.4019       65        $Q_2$
   18       0.3717       92        $Q_2$
   19       0.3445       32        $Q_1$
   20       0.5067       21        $Q_1$
   21       0.6040       66        $Q_2$
   22       0.5536       35        $Q_1$
   23       0.6322       18        $Q_1$
   24       0.6793       65        $Q_1$
   25       0.7076


which is above 20, and so $Q_2$ was applied to the initial p value and this
gave a new p value of 0.1900. Then another random number, 29, was
selected, and this was above 19, indicating again that $Q_2$ should be applied.


SEC. 6.3 THE ASYMPTOTIC DISTRIBUTIONS 131 


This procedure was repeated and on trial number 6 the random number
was 05, which was less than 15, the number corresponding to the p value
of 0.1531. Hence $Q_1$ was applied to 0.1531, giving a new p value of
0.3919. In Fig. 6.1 the small circles represent the successive means for
84 stat-rats, each run through 25 trials, with the parameter values given
above.
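
The computation sheet of Table 6.1 can be reproduced mechanically. The following Python sketch uses the 25 random numbers listed in the table; the function name and the assumption that the sheet rounds p to four decimals after every trial (half rounded up) are choices of this sketch, made because they reproduce the tabulated values.

```python
from decimal import Decimal, ROUND_HALF_UP

A1, ALPHA1 = Decimal("0.3"), Decimal("0.6")     # Q1 p = 0.3 + 0.6 p
A2, ALPHA2 = Decimal("0.01"), Decimal("0.9")    # Q2 p = 0.01 + 0.9 p
FOUR_PLACES = Decimal("0.0001")

# The 25 two-digit random numbers listed in Table 6.1.
RANDOM_NUMBERS = [84, 29, 35, 69, 53, 37, 5, 50, 57, 60, 55, 58, 79,
                  50, 56, 1, 51, 65, 92, 32, 21, 66, 35, 18, 65]


def run_stat_rat(p0, random_numbers):
    """Return the list of (trial, p, random number, operator) rows."""
    p = Decimal(p0)
    rows = []
    for trial, number in enumerate(random_numbers):
        low = Decimal(number) / 100
        high = Decimal(number + 1) / 100
        if high <= p:
            operator = 1
        elif low >= p:
            operator = 2
        else:
            raise ValueError("tie: more random digits are needed")
        rows.append((trial, p, number, operator))
        a, alpha = (A1, ALPHA1) if operator == 1 else (A2, ALPHA2)
        # Assumption: the sheet rounds p to four decimals after every trial.
        p = (a + alpha * p).quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)
    rows.append((len(random_numbers), p, None, None))
    return rows


for row in run_stat_rat("0.2000", RANDOM_NUMBERS):
    print(row)
```

No ties occur with these particular random numbers, so the sketch simply raises an error where extra digits would be required.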


[Figure 6.1: plot of probability, p, against trials, n, for n from 0 to 25, with curves labeled A, B, C, and D.]

Fig. 6.1. Mean p values obtained from 84 stat-rats for 25 trials. The small circles
denote these means. Also shown are upper and lower bounds, discussed in Section 6.7,
for the true distribution means. Curve A is the expected operator bound computed
from equation 6.69. Curve D is a lower bound computed from equation 6.68. Curves
B and C are the bounds computed from equations 6.71, 6.72, and 6.73. All computations
used the values $a_1 = 0.3$, $\alpha_1 = 0.6$, $a_2 = 0.01$, $\alpha_2 = 0.9$, and $p_0 = 0.2$.

In a sense, the procedure just described can provide us with all the 
available implications of our basic postulates. With enough patience, 
we could run hundreds of stat-rats for each set of parameter values that 
interested us. Only because we are not so patient as that do we bother 
with the mathematical analysis contained in most of Part I. On the other 
hand, stat-rat computations serve a slightly different purpose. As we 
have already suggested, a stat-rat is a sort of theoretical organism, a
"mathematical robot." It will generate sequences of "responses" similar
to the sequences of real organisms. The adequacy of the model can be 
judged in part by comparing stat-rat sequences with experimentally 
observed sequences. In Part II we shall make such comparisons. 
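
Running many stat-rats is straightforward with a pseudo-random number generator in place of a printed table. The sketch below is an illustration only: it uses Python's `random` module with an arbitrary seed, so its means will not coincide with the 84 hand-computed stat-rats of Fig. 6.1, and the function names are assumptions.

```python
import random


def stat_rat_path(p0, n_trials, rng):
    """One subject-controlled stat-rat: Q1 is applied with probability p."""
    p, path = p0, [p0]
    for _ in range(n_trials):
        if rng.random() < p:
            p = 0.3 + 0.6 * p      # Q1
        else:
            p = 0.01 + 0.9 * p     # Q2
        path.append(p)
    return path


def mean_curve(n_rats, p0=0.2, n_trials=25, seed=0):
    """Average p on each trial over n_rats independent stat-rats."""
    rng = random.Random(seed)      # the seed is an arbitrary choice
    paths = [stat_rat_path(p0, n_trials, rng) for _ in range(n_rats)]
    return [sum(path[n] for path in paths) / n_rats for n in range(n_trials + 1)]


for n, mean in enumerate(mean_curve(84)):
    print(n, round(mean, 3))
```

Increasing `n_rats` reduces the sampling error of each trial mean, at the cost of more computation.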


6.3 THE ASYMPTOTIC DISTRIBUTIONS

The stat-rat procedure of making computations is now used to obtain
the approximate form of the asymptotic distribution for a number of
special cases. Two somewhat different methods are available. The



first involves making a large number of stat-rat runs—say a thousand—for 
a number of trials. When the number of trials is sufficiently large, the 
final p values obtained from each run form a distribution which approxi- 
mates the asymptotic distribution as closely as we like. The initial p value 
of these runs is arbitrary since the final p value is nearly independent of 
the initial p value for the cases we consider. 

For a given numerical example we would like to estimate how many
trials are needed to approximate the asymptotic distribution with a specified
accuracy. Suppose that we have decided to use a class interval of $\gamma$ in
computing the approximate distribution. For example, we might be
quite satisfied with percentage points of the cumulative distribution a
distance of 0.01 apart. Suppose we have two sequences beginning at
$p_0$ and $p_0'$, but using the same sequence of random numbers. We would
want to choose the trial number $n$ so that $p_n$ and $p_n'$ for these two sequences
would lie within the same class interval or at least in neighboring classes
of p values. Thus we want to choose $n$ so that
$$|\,p_n - p_n'\,| \le \gamma. \tag{6.1}$$
For experimenter-controlled events, consider two initial p values $p_0$ and
$p_0'$. We can easily show that
$$Q_i\,p_0 - Q_i\,p_0' = \alpha_i (p_0 - p_0').$$
Thus the difference between the two initial values is multiplied by some
number $\alpha_i < 1$, depending upon which operator is applied on the zeroth
trial. On the next trial, the difference $p_1 - p_1'$ is multiplied again by one
of the $\alpha_i$. Let $\beta$ stand for the largest $\alpha_i$ $(i = 1, 2, \cdots, t)$. Then on each
trial the difference of the p values of the two sequences is multiplied by a
number at least as small as $\beta$. Hence, for $p_0 \ge p_0'$,
$$p_n - p_n' \le \beta^{n} (p_0 - p_0'). \tag{6.2}$$

Since all the asymptotic p values are between $\lambda_{\max}$ and $\lambda_{\min}$, by the
trapping theorem of Section 4.7, we would certainly choose $p_0$ and $p_0'$
within these limits. Hence $p_0 - p_0'$ is at most $(\lambda_{\max} - \lambda_{\min})$, and so
we would want $n$ to satisfy
$$\beta^{n} (\lambda_{\max} - \lambda_{\min}) \le \gamma. \tag{6.3}$$


Consider the following example: There are two operators with $\alpha_1 = 0.6$,
$\alpha_2 = 0.9$, $\lambda_1 = 0.75$, $\lambda_2 = 0.1$. Then $\beta = 0.9$, $\lambda_{\max} = 0.75$, and
$\lambda_{\min} = 0.1$. If we select a class interval of $\gamma = 0.01$, our inequality is
$$(0.9)^{n}(0.75 - 0.10) \le 0.01,$$
or
$$(0.9)^{n} \le 0.0154.$$
The smallest integer which satisfies this is $n = 40$ since $(0.9)^{40} = 0.0148$




and $(0.9)^{39} = 0.0164$. Therefore we need 40 trials of stat-rat computations
to be certain that the most extreme values of $p_0$ lead to final p values in the
same or adjacent classes in the desired distribution. Ordinarily, a smaller
number of trials would be adequate since random number sequences
which would keep the p values as far apart as $\beta^{n}(p_0 - p_0')$ are rare.
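
Inequality 6.3 can be solved for the smallest $n$ by repeated multiplication. The function name `trials_needed` is an assumption of this sketch, which requires $0 < \beta < 1$ so that the loop terminates.

```python
def trials_needed(beta, lam_max, lam_min, gamma):
    """Smallest n with beta**n * (lam_max - lam_min) <= gamma (inequality 6.3)."""
    n = 0
    gap = lam_max - lam_min
    while gap > gamma:
        gap *= beta
        n += 1
    return n


print(trials_needed(0.9, 0.75, 0.10, 0.01))   # 40
```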


[Figure 6.2: cumulative distribution F(p) plotted against p, for p from 0 to 1.0.]

Fig. 6.2. The approximate asymptotic cumulative distribution of p values obtained
from a 1000-trial stat-rat for the example of experimenter-controlled
events. $Q_1 p = 0.3 + 0.6p$, $Q_2 p = 0.01 + 0.9p$, $\pi_1 = \pi_2 = 0.5$. The vertical
dotted lines show the trapping limits $\lambda_2 = 0.10$ and $\lambda_1 = 0.75$.


For two subject-controlled events, it is rather difficult to obtain a good 
estimate of the number of trials necessary to obtain the desired approxi- 
mation to the asymptotic distribution, so we do not attempt it here. 

The method discussed above is rather wasteful because only the p value
on the last trial of each stat-rat is used to approximate the asymptotic
distribution. However, the ergodic theorem for single sequences, stated
in Section 4.7, provides us with a much less wasteful procedure. According
to that theorem, a single sequence (that is, a single stat-rat) will
generate the asymptotic distribution when that sequence becomes infinitely
long. The only practical problem then is to estimate how long the
sequence must be to give a reasonable approximation. The main criterion
is to let the total number of trials be large compared to the number of
trials necessary for a single stat-rat. In the previous numerical example
we needed at most 40 trials per stat-rat and so a single stat-rat of say
1000 trials might be adequate.




We have used the single stat-rat procedure to obtain the asymptotic 
distribution for a number of numerical examples. In Table 6.2 and Fig. 
6.2 we give the results for a case of two experimenter-controlled events 


TABLE 6.2

Frequencies of p values which occurred in a 1000 trial stat-rat with
$Q_1 p = 0.3 + 0.6p$, $Q_2 p = 0.01 + 0.9p$, $\pi_1 = \pi_2 = 0.5$. All p
values were rounded to two decimals.

    p      f        p      f        p      f

 ≤0.26     0      0.43     6      0.60    33
  0.27     1      0.44     3      0.61    53
  0.28     0      0.45     7      0.62    58
  0.29     1      0.46     7      0.63    23
  0.30     0      0.47    13      0.64    58
  0.31     1      0.48    10      0.65    44
  0.32     1      0.49    11      0.66    36
  0.33     1      0.50     8      0.67    65
  0.34     1      0.51    28      0.68    59
  0.35     0      0.52    17      0.69    28
  0.36     1      0.53    20      0.70    55
  0.37     2      0.54    28      0.71    44
  0.38     2      0.55    18      0.72    30
  0.39     6      0.56    32      0.73    32
  0.40     3      0.57    36      0.74    46
  0.41     3      0.58    22      0.75     0
  0.42     9      0.59    38      Total 1000


with $Q_1 p = 0.3 + 0.6p$, $Q_2 p = 0.01 + 0.9p$, $\pi_1 = \pi_2 = 0.5$. In Section
4.3 we computed the asymptotic mean and standard deviation from the
exact moment formulas for this numerical example. We obtained
$V_{1,\infty} = 0.62$ and $\sigma_\infty = 0.08$. From the 1000-trial stat-rat used in obtaining
Table 6.2 and Fig. 6.2 we obtained a mean of 0.6185 and a standard
deviation of 0.086. This close agreement provides a further check on the
usefulness of the method.
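
The comparison can be repeated with a short Python sketch. The exact moments below are obtained by taking expectations of $a_i + \alpha_i p$ and of its square with fixed event probabilities, which is an assumption of this sketch about how the values 0.62 and 0.08 arise rather than a transcription of the formulas of Section 4.3. The simulation uses an arbitrary seed and starting value, so its output will differ slightly from the 0.6185 and 0.086 reported in the text.

```python
import math
import random

OPERATORS = [(0.3, 0.6), (0.01, 0.9)]   # (a_i, alpha_i) for Q1 and Q2
PROBS = [0.5, 0.5]                      # fixed event probabilities pi_1, pi_2


def exact_asymptotic_moments(operators, probs):
    """Asymptotic mean and standard deviation for fixed event probabilities."""
    a = sum(pr * ai for pr, (ai, _) in zip(probs, operators))
    alpha = sum(pr * al for pr, (_, al) in zip(probs, operators))
    mean = a / (1.0 - alpha)
    const = sum(pr * ai * ai for pr, (ai, _) in zip(probs, operators))
    cross = sum(pr * 2.0 * ai * al for pr, (ai, al) in zip(probs, operators))
    square = sum(pr * al * al for pr, (_, al) in zip(probs, operators))
    second = (const + cross * mean) / (1.0 - square)
    return mean, math.sqrt(second - mean * mean)


def single_stat_rat(operators, probs, n_trials, p0=0.5, seed=1):
    """Sample mean and standard deviation of p along one long sequence."""
    rng = random.Random(seed)           # arbitrary seed, assumed for repeatability
    p, values = p0, []
    for _ in range(n_trials):
        a, alpha = operators[0] if rng.random() < probs[0] else operators[1]
        p = a + alpha * p
        values.append(p)
    mean = sum(values) / n_trials
    var = sum((x - mean) ** 2 for x in values) / n_trials
    return mean, math.sqrt(var)


print(exact_asymptotic_moments(OPERATORS, PROBS))
print(single_stat_rat(OPERATORS, PROBS, 1000))
```

Lengthening the single sequence beyond 1000 trials brings the sample moments closer to the exact ones, in line with the ergodic argument above.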

For two subject-controlled events we computed an example where
$Q_1 p = 0.3 + 0.6p$ and $Q_2 p = 0.01 + 0.9p$. The results are shown in
Fig. 6.3. From these computations we obtain an asymptotic mean of
$V_{1,\infty} = 0.6731$ and an asymptotic standard deviation of $\sigma_\infty = 0.0784$.
In Section 6.8 we compare this result for the mean with some computed
bounds on $V_{1,\infty}$ for the same numerical example.

In Fig. 6.4 we provide another example of the asymptotic distribution 




for two subject-controlled events. This example is for the operators
$Q_1 p = 0.3 + 0.6p$ and $Q_2 p = 0.06 + 0.7p$ ($\lambda_1 = 0.75$ and $\lambda_2 = 0.20$).
From the computations we obtained a mean $V_{1,\infty} = 0.513$ and a standard
deviation $\sigma_\infty = 0.173$.

[Figure 6.3: cumulative distribution F(p) plotted against p, for p from 0 to 1.0.]

Fig. 6.3. The approximate asymptotic cumulative distribution of p values obtained
from a 4000-trial stat-rat for an example of subject-controlled events.
$Q_1 p = 0.3 + 0.6p$, $Q_2 p = 0.01 + 0.9p$. The vertical dotted lines show the
trapping limits $\lambda_2 = 0.10$ and $\lambda_1 = 0.75$.

The equal alpha condition discussed in Chapter 5 led to exact formulas
for the moments, but it is still a great deal of labor to obtain the form of
the asymptotic distribution from those moment formulas. Therefore,
we provide two examples of distributions obtained by the single stat-rat
procedure. The first, for $Q_1 p = 0.4 + 0.5p$ and $Q_2 p = 0.1 + 0.5p$
($\lambda_1 = 0.8$ and $\lambda_2 = 0.2$), is presented in Fig. 6.5. The observed mean
was 0.4826 as compared with the true mean, 0.5000, computed from
asymptotic formula 5.30. The observed standard deviation was 0.2464
as compared with the true value, 0.2236, obtained from asymptotic
formula 5.48. Finally, in Fig. 6.6 we give an illustration for $Q_1 p = 0.5p$
and $Q_2 p = 0.5 + 0.5p$ ($\lambda_1 = 0$ and $\lambda_2 = 1$), an example of the special
case discussed in Section 5.7. The true distribution parameters obtained
from equations 5.58 and 5.63 are $V_{1,\infty} = 0.5000$ and $\sigma_\infty = 0.2236$.
The values obtained from the single stat-rat computation were $V_{1,\infty} =
0.4958$ and $\sigma_\infty = 0.2227$.


[Figure 6.4: cumulative distribution F(p) plotted against p, for p from 0 to 1.0, with a histogram of the density below.]

Fig. 6.4. The approximate asymptotic cumulative distribution of p values
obtained from a 1000-trial stat-rat for another example of subject-controlled
events. $Q_1 p = 0.3 + 0.6p$, $Q_2 p = 0.06 + 0.7p$. The vertical dotted lines show
the trapping limits $\lambda_2 = 0.20$ and $\lambda_1 = 0.75$. Note that this distribution is much
more symmetric about the center than the distribution in Fig. 6.3. For comparison,
a straight line, corresponding to a uniform distribution, is
also shown. The histogram below the cumulative diagram gives
a rough picture of the density in class intervals.

[Figure 6.5: cumulative distribution F(p) plotted against p, for p from 0 to 1.0, with a histogram of the density below.]

Fig. 6.5. The approximate asymptotic cumulative distribution of p values,
obtained from a 4000-trial stat-rat for the equal alpha case of two subject-
controlled events, with $\lambda_1 = 0.8$, $\lambda_2 = 0.2$, and $\alpha = 0.5$. The histogram below
the cumulative diagram gives a rough picture of the density in class intervals.


[Figure 6.6: cumulative distribution F(p) plotted against p, for p from 0 to 1.0, with a straight reference line.]

Fig. 6.6. The approximate asymptotic cumulative distribution of p values,
obtained from a 2000-trial stat-rat, for the equal alpha case of two subject-
controlled events, with $\lambda_1 = 0$ and $\lambda_2 = 1$. This case is discussed in Section 5.7.
For the computations, $Q_1 p = 0.5p$, $Q_2 p = 0.5 + 0.5p$. The straight line
corresponds to a uniform distribution on the interval from zero to unity.


6.4 THE EXPECTED OPERATOR 


In Section 3.12 we considered a rather obvious device for computing 
means from trial to trial for two subject-controlled events, and we consider 
it again here. This procedure involved computing the mathematical 
expectation of the p value on trial 1, using this mean p value for the
probability of application of $Q_1$ on trial 1 to obtain the mean p value on
trial 2, etc. The analogous procedure did yield the correct means for
experimenter-controlled events, but, as we saw in Section 3.12, it did not
work for subject-controlled events. The one exception is the equal alpha
case described in Chapter 5. Nevertheless, we consider the procedure in
this section without the equal alpha condition to provide a possible
approximation scheme.

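As in Section 5.4 we define an expected operator $\bar{Q}$, for two subject-
controlled events, by the equation
$$\bar{Q} = V_{1,n}\,Q_1 + (1 - V_{1,n})\,Q_2. \tag{6.4}$$
Using the slope-intercept form for the operators, $Q_i\,p = a_i + \alpha_i p$, we have
$$
\begin{aligned}
\bar{Q}\,V_{1,n} &= V_{1,n}(a_1 + \alpha_1 V_{1,n}) + (1 - V_{1,n})(a_2 + \alpha_2 V_{1,n})\\
 &= a_2 + (a_1 - a_2 + \alpha_2)\,V_{1,n} + (\alpha_1 - \alpha_2)\,V_{1,n}^{2}.
\end{aligned}
\tag{6.5}
$$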


SEC. 6.4 THE EXPECTED OPERATOR 139 


We then assume that this expression gives an approximation to $V_{1,n+1}$,
that is,
$$V_{1,n+1} \approx \bar{Q}\,V_{1,n}. \tag{6.6}$$
We thus have a quadratic difference equation which cannot be solved by
elementary means, but we can study some of its properties.

First consider the asymptotic solution for which $\bar{Q}\,V_{1,\infty} = V_{1,\infty}$. We
have
$$V_{1,\infty} = a_2 + (a_1 - a_2 + \alpha_2)\,V_{1,\infty} + (\alpha_1 - \alpha_2)\,V_{1,\infty}^{2}. \tag{6.7}$$
The solution of this quadratic is
$$V_{1,\infty} = \frac{(1 - a_1 + a_2 - \alpha_2) \pm \sqrt{(1 - a_1 + a_2 - \alpha_2)^{2} - 4a_2(\alpha_1 - \alpha_2)}}{2(\alpha_1 - \alpha_2)}. \tag{6.8}$$


Usually only one of the roots falls within the possible range, and so it is 
easy to decide which sign before the radical is appropriate. This approxi- 
mate formula for the asymptotic mean is sometimes useful. As we see 
in the next section, it provides either an upper or a lower bound on the 
true value. 
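
The choice of root can be sketched in Python as follows. The function names are assumptions of this sketch; the parameter values in the usage line are those of the stat-rat of Table 6.1. The sketch requires $\alpha_1 \neq \alpha_2$, since otherwise equation 6.7 is linear.

```python
import math


def expected_operator_fixed_points(a1, alpha1, a2, alpha2):
    """Both roots of equation 6.7, computed from formula 6.8."""
    b = 1.0 - a1 + a2 - alpha2
    disc = b * b - 4.0 * a2 * (alpha1 - alpha2)
    root = math.sqrt(disc)
    denom = 2.0 * (alpha1 - alpha2)     # requires alpha1 != alpha2
    return (b + root) / denom, (b - root) / denom


def admissible_root(a1, alpha1, a2, alpha2):
    """Return the root lying in [0, 1], the possible range of a probability."""
    roots = [v for v in expected_operator_fixed_points(a1, alpha1, a2, alpha2)
             if 0.0 <= v <= 1.0]
    if len(roots) != 1:
        raise ValueError("expected exactly one root in [0, 1]")
    return roots[0]


print(admissible_root(0.3, 0.6, 0.01, 0.9))   # about 0.682
```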

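Equation 6.6 may be used for trial by trial numerical computations,
but it is often more convenient to consider the corresponding differential
equation which is obtained by using the further approximation
$$\frac{dV_{1,n}}{dn} \approx V_{1,n+1} - V_{1,n}. \tag{6.9}$$
(Cf. Section 3.4.) We then have
$$\frac{dV_{1,n}}{dn} \approx a_2 + (a_1 - a_2 + \alpha_2 - 1)\,V_{1,n} + (\alpha_1 - \alpha_2)\,V_{1,n}^{2}. \tag{6.10}$$
This equation is integrated directly [7] to yield
$$V_{1,n} \approx \frac{1 - \beta}{2(\alpha_1 - \alpha_2)} + \frac{\rho}{2(\alpha_1 - \alpha_2)} \cdot \frac{1 + Ce^{\rho n}}{1 - Ce^{\rho n}}, \tag{6.11}$$
where
$$
\begin{aligned}
\beta &= a_1 - a_2 + \alpha_2,\\
\rho &= \sqrt{(1 - \beta)^{2} - 4a_2(\alpha_1 - \alpha_2)},\\
C &= \frac{2(\alpha_1 - \alpha_2)\,V_{1,0} - (1 - \beta) - \rho}{2(\alpha_1 - \alpha_2)\,V_{1,0} - (1 - \beta) + \rho}.
\end{aligned}
\tag{6.12}
$$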




In the limit as $n \to \infty$, the ratio $(1 + Ce^{\rho n})/(1 - Ce^{\rho n})$ approaches $-1$
and so
$$V_{1,\infty} \approx \frac{1 - \beta - \rho}{2(\alpha_1 - \alpha_2)}, \tag{6.13}$$
in agreement with equation 6.8 above.

Consider our numerical example for which $a_1 = 0.3$, $a_2 = 0.01$,
$\alpha_1 = 0.6$, $\alpha_2 = 0.9$, and $V_{1,0} = p_0 = 0.2$. We then get from the above
definitions $\beta = 1.19$, $\rho = 0.2193$, and $C = -0.5161$, and so $V_{1,\infty} \approx
0.6822$. For the mean on trial $n$ we have
$$V_{1,n} \approx 0.3167 - 0.3655\;\frac{1 - 0.5161\,e^{0.2193n}}{1 + 0.5161\,e^{0.2193n}}. \tag{6.14}$$
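
The constants of this example, and the difference between the differential-equation approximation 6.11 and direct iteration of equation 6.6, can be examined with the following Python sketch. The function names are assumptions; the parameter values are those of the numerical example.

```python
import math


def ode_constants(a1, alpha1, a2, alpha2, v0):
    """Constants beta, rho, C of equations 6.12."""
    beta = a1 - a2 + alpha2
    rho = math.sqrt((1.0 - beta) ** 2 - 4.0 * a2 * (alpha1 - alpha2))
    d = 2.0 * (alpha1 - alpha2)
    c = (d * v0 - (1.0 - beta) - rho) / (d * v0 - (1.0 - beta) + rho)
    return beta, rho, c


def ode_mean(a1, alpha1, a2, alpha2, v0, n):
    """Approximate mean on trial n from equation 6.11."""
    beta, rho, c = ode_constants(a1, alpha1, a2, alpha2, v0)
    d = 2.0 * (alpha1 - alpha2)
    e = c * math.exp(rho * n)
    return (1.0 - beta) / d + (rho / d) * (1.0 + e) / (1.0 - e)


def iterated_mean(a1, alpha1, a2, alpha2, v0, n):
    """Trial-by-trial iteration of the expected operator (equations 6.5, 6.6)."""
    v = v0
    for _ in range(n):
        v = a2 + (a1 - a2 + alpha2) * v + (alpha1 - alpha2) * v * v
    return v


params = (0.3, 0.6, 0.01, 0.9, 0.2)
print(ode_constants(*params))
for n in (0, 5, 10, 25, 100):
    print(n, round(ode_mean(*params, n), 4), round(iterated_mean(*params, n), 4))
```

Both columns start at $p_0 = 0.2$ and approach the same limit; they differ at intermediate trials because the differential equation replaces the unit step in $n$ by a derivative.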


For experimenter-subject-controlled events with two response classes 
and $s$ possible outcomes, we may define an expected operator $\bar{Q}$ by

(6.15)   $\bar{Q}V_{1,n} = V_{1,n}\sum_{k=1}^{s}\pi_{1k}Q_{1k}V_{1,n} + (1 - V_{1,n})\sum_{k=1}^{s}\pi_{2k}Q_{2k}V_{1,n},$

where the conditional probabilities satisfy the conditions

(6.16)   $\sum_{k=1}^{s}\pi_{jk} = 1 \qquad (j = 1, 2).$

The operators are defined by the expressions

(6.17)   $Q_{jk}V_{1,n} = a_{jk} + \alpha_{jk}V_{1,n}.$


When these expressions are inserted in the equation defining Q and the 
following abbreviations used, 


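(6.18)   $\bar{a}_j = \sum_{k=1}^{s}\pi_{jk}a_{jk}, \qquad \bar{\alpha}_j = \sum_{k=1}^{s}\pi_{jk}\alpha_{jk} \qquad (j = 1, 2),$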


we obtain the equation 


(6.19)   $\bar{Q}V_{1,n} = \bar{a}_2 + (\bar{a}_1 - \bar{a}_2 + \bar{\alpha}_2)V_{1,n} + (\bar{\alpha}_1 - \bar{\alpha}_2)V_{1,n}^2.$


This equation is similar in form to equation 6.5 for subject-controlled events. Therefore, the approximate solution of that equation generalizes at once to the solution of equation 6.19, provided that we make the obvious parameter substitutions. Equations 6.11, 6.12, and 6.13 become


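(6.20)   $V_{1,n} = \dfrac{1 - \bar{\beta}}{2(\bar{\alpha}_1 - \bar{\alpha}_2)} + \dfrac{\bar{\rho}}{2(\bar{\alpha}_1 - \bar{\alpha}_2)} \cdot \dfrac{1 + \bar{C}e^{\bar{\rho}n}}{1 - \bar{C}e^{\bar{\rho}n}},$

(6.21)   $\bar{\beta} = \bar{a}_1 - \bar{a}_2 + \bar{\alpha}_2,$

(6.22)   $\bar{\rho} = \sqrt{(1 - \bar{\beta})^2 - 4\bar{a}_2(\bar{\alpha}_1 - \bar{\alpha}_2)},$

(6.23)   $\bar{C} = \dfrac{2(\bar{\alpha}_1 - \bar{\alpha}_2)V_{1,0} - (1 - \bar{\beta}) - \bar{\rho}}{2(\bar{\alpha}_1 - \bar{\alpha}_2)V_{1,0} - (1 - \bar{\beta}) + \bar{\rho}},$

(6.24)   $V_{1,\infty} = \dfrac{1 - \bar{\beta} - \bar{\rho}}{2(\bar{\alpha}_1 - \bar{\alpha}_2)}.$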
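The reduction of the expected operator for experimenter-subject-controlled events to a quadratic in $V_{1,n}$ can be checked numerically. The Python sketch below assumes that the barred parameters are the averages of $a_{jk}$ and $\alpha_{jk}$ weighted by the conditional probabilities $\pi_{jk}$; the function names and the two-outcome parameter values are hypothetical and chosen only for illustration.

```python
def averaged_parameters(pi, a, alpha):
    """Probability-weighted averages over outcomes k for each response class j."""
    a_bar = [sum(w * x for w, x in zip(pi[j], a[j])) for j in range(2)]
    alpha_bar = [sum(w * x for w, x in zip(pi[j], alpha[j])) for j in range(2)]
    return a_bar, alpha_bar

def expected_operator_direct(v, pi, a, alpha):
    """Equation 6.15 with Q_jk V = a_jk + alpha_jk * V."""
    q = [sum(w * (x + y * v) for w, x, y in zip(pi[j], a[j], alpha[j]))
         for j in range(2)]
    return v * q[0] + (1.0 - v) * q[1]

def expected_operator_quadratic(v, a_bar, alpha_bar):
    """Equation 6.19 written with the averaged parameters."""
    return (a_bar[1] + (a_bar[0] - a_bar[1] + alpha_bar[1]) * v
            + (alpha_bar[0] - alpha_bar[1]) * v * v)

# Hypothetical values for two outcomes per response class (not from the text).
pi = [[0.7, 0.3], [0.4, 0.6]]        # each row sums to one, as in equation 6.16
a = [[0.30, 0.00], [0.01, 0.05]]
alpha = [[0.6, 0.8], [0.9, 0.7]]

a_bar, alpha_bar = averaged_parameters(pi, a, alpha)
direct = expected_operator_direct(0.2, pi, a, alpha)
quadratic = expected_operator_quadratic(0.2, a_bar, alpha_bar)
print(direct, quadratic)
```

The two evaluations agree, so the formulas for the subject-controlled case carry over once the averaged parameters are substituted.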


*6.5 BOUNDS ON THE ASYMPTOTIC MEAN

We have just indicated that the expected operator usually gives only an 
approximation to the distribution mean. But we have no way to know 
how close the approximation is without computing the exact means. 
Therefore it is useful to obtain some bounds on the mean, that is, a quantity 
which is always greater than the mean and another quantity which is 
always less than the mean. When these bounds are close together we 
have a good estimate of the true mean. In this and the following section 
we restrict our attention to the asymptotic distributions for two subject-controlled events.

The recurrence relation, equation 4.52, gives for the mean, $V_{1,n+1}$, on the $(n + 1)$st trial

(6.25)   $V_{1,n+1} = C_{10} + C_{11}V_{1,n} + C_{12}V_{2,n}.$

The coefficients are defined as before by

(6.26)   $C_{10} = a_2, \qquad C_{11} = a_1 - a_2 + \alpha_2, \qquad C_{12} = \alpha_1 - \alpha_2.$

For very large $n$, we may take $V_{1,n+1} = V_{1,n} = V_{1,\infty} \equiv V_1$; also $V_{2,n+1} = V_{2,n} = V_{2,\infty} \equiv V_2$. Thus we get

(6.27)   $C_{10} + (C_{11} - 1)V_1 + C_{12}V_2 = 0.$


Now if we knew the second raw moment $V_2$, there would be no problem; we could obtain $V_1$ immediately. However, $V_2$ is not known, but we might expect to obtain upper and lower bounds on $V_1$ by replacing $V_2$ with its upper and lower bounds. The task then reduces to finding bounds, $V_2'$ and $V_2''$, on $V_2$. Of course we can say at once that

(6.28)   $0 \le V_2 \le 1,$

since our asymptotic distribution lies on the interval from zero to unity.
If $x$ is the variable along that interval and if $f(x)$ is the probability density function,* the second raw moment is by definition

(6.29)   $V_2 = \int_0^1 x^2 f(x)\,dx.$

The variable $x$ is at most unity, and, moreover,

(6.30)   $\int_0^1 f(x)\,dx = 1.$


Hence we obtain inequalities 6.28. But we can do better than this, for 
we know from the trapping theorem of Section 4.7 that the asymptotic 
distribution must lie between $\lambda_2$ and $\lambda_1$. For present purposes we assume that $\lambda_2 < \lambda_1$, but we show later that interchanging $\lambda_1$ and $\lambda_2$ does not alter our final results. We therefore require that $0 \le \lambda_2 \le x \le \lambda_1 \le 1$. From equation 6.29 we then see that

(6.31)   $\lambda_2^2 \le V_2 \le \lambda_1^2.$

These limits on $V_2$ could be used in the relation 6.27, but ordinarily they will provide no better bounds on $V_1$ than the ones we already have, namely,

(6.32)   $\lambda_2 \le V_1 \le \lambda_1.$


From the foregoing discussion it should be clear that we are not satisfied with just any bounds on $V_2$; we are looking for bounds that are close together. In particular, we should like to find bounds on $V_2$ for a given value of $V_1$: How large and how small can $V_2$ become for any distribution between $\lambda_2$ and $\lambda_1$ and with mean $V_1$? The lower limit on $V_2$ is easy, for we know that (see Section 4.2)

(6.33)   $V_2 = V_1^2 + \sigma^2,$

where $\sigma^2$ is the variance about the mean $V_1$. This variance is never negative and so we have

(6.34)   $V_2 \ge V_1^2.$


In order to ascertain whether this is the best possible lower bound on $V_2$ we ask whether there exists a distribution on the interval $\lambda_2$ to $\lambda_1$ with mean $V_1$ and second raw moment $V_2 = V_1^2$. If such a distribution does exist, we know that the bound is the best available unless further restrictions on the distribution are specified. The problem, then, is to find a distribution, if possible, with $V_2 = V_1^2$. This is accomplished immediately, for a distribution having all its density at $V_1$, where $\lambda_2 \le V_1 \le \lambda_1$, has a second moment equal to $V_1^2$. Hence, we have obtained the best possible lower bound on $V_2$ for given $V_1$.

* For simplicity of exposition we are using the density function instead of the cumulative. We point this out because the asymptotic density function may not exist though the cumulative does.

The best possible upper bound on $V_2$ for given $V_1$ is a little more difficult to obtain. We begin by considering a distribution $g(z)$ where $0 \le z \le 1$.


The mean, $U_1$, is

(6.35)   $U_1 = \int_0^1 z\,g(z)\,dz,$

and the second raw moment is

(6.36)   $U_2 = \int_0^1 z^2 g(z)\,dz.$

And, since $z$ is between zero and unity, we have

(6.37)   $\int_0^1 z^2 g(z)\,dz \le \int_0^1 z\,g(z)\,dz,$

and so

(6.38)   $U_2 \le U_1.$
We now transform this distribution to the interval from $\lambda_2$ to $\lambda_1$ by letting

(6.39)   $z = \dfrac{x - \lambda_2}{\lambda_1 - \lambda_2}.$

We call the transformed distribution $f(x)$, and a necessary condition for making the transformation is that $f(x)\,dx = g(z)\,dz$. We need the transformation equations for the moments. We have

(6.40)   $U_1 = \int_0^1 z\,g(z)\,dz = \int_{\lambda_2}^{\lambda_1}\dfrac{x - \lambda_2}{\lambda_1 - \lambda_2}f(x)\,dx = \dfrac{1}{\lambda_1 - \lambda_2}\int_{\lambda_2}^{\lambda_1}x f(x)\,dx - \dfrac{\lambda_2}{\lambda_1 - \lambda_2}\int_{\lambda_2}^{\lambda_1}f(x)\,dx.$

The first integral is the mean $V_1$ for our distribution on the interval from $\lambda_2$ to $\lambda_1$, and the last integral is the total density and so is unity. Hence, we have

(6.41)   $U_1 = \dfrac{V_1 - \lambda_2}{\lambda_1 - \lambda_2}.$

By the same procedure we obtain

(6.42)   $U_2 = \dfrac{V_2 - 2\lambda_2 V_1 + \lambda_2^2}{(\lambda_1 - \lambda_2)^2}.$




We now insert these last two relations in the inequality 6.38 and obtain, after multiplying through by $(\lambda_1 - \lambda_2)^2$,

(6.43)   $V_2 - 2\lambda_2 V_1 + \lambda_2^2 \le (V_1 - \lambda_2)(\lambda_1 - \lambda_2).$

After rearranging, we have

(6.44)   $V_2 \le (\lambda_1 + \lambda_2)V_1 - \lambda_1\lambda_2.$


Fig. 6.7. A distribution having all its density at the limits $\lambda_2$ and $\lambda_1$.


This is the upper bound on $V_2$ we are looking for. We can establish that it is the best possible upper bound by finding a distribution on the interval between $\lambda_2$ and $\lambda_1$ having mean $V_1$ and second moment

(6.45)   $V_2 = (\lambda_1 + \lambda_2)V_1 - \lambda_1\lambda_2.$

Such a distribution is the one shown in Fig. 6.7. If the density at $\lambda_1$ is $u$, and the density at $\lambda_2$ is $1 - u$, we have for the mean and second raw moment

(6.46)   $V_1 = u\lambda_1 + (1 - u)\lambda_2,$

(6.47)   $V_2 = u\lambda_1^2 + (1 - u)\lambda_2^2.$

Solving equation 6.46 for $u$, we obtain

(6.48)   $u = \dfrac{V_1 - \lambda_2}{\lambda_1 - \lambda_2}.$

If this result is substituted into equation 6.47 and simplifications are made, we at once obtain the upper bound for $V_2$ given earlier in equation 6.45. Therefore, we have found a distribution on the interval from $\lambda_2$ to $\lambda_1$ with mean $V_1$ and second moment as large as the upper bound permitted by the inequality 6.44. Therefore, we have found the best possible upper bound on $V_2$ for given $V_1$. Combining the upper and lower bounds, we have finally the

BOUNDS ON $V_2$:

(6.49)   $V_1^2 \le V_2 \le (\lambda_1 + \lambda_2)V_1 - \lambda_1\lambda_2.$

We observe that if $\lambda_1$ and $\lambda_2$ are interchanged we do not change the bounds on $V_2$ and so the results apply when $\lambda_1 < \lambda_2$ as well as when $\lambda_2 < \lambda_1$.
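The claim that the upper bound on the second raw moment is attained by a distribution concentrated at the two limit points can be verified with a few lines of arithmetic. The Python sketch below is illustrative only: the function name, the weight $u = 0.4$, and the choice of limits 0.75 and 0.10 are assumptions made for the example.

```python
def second_moment_bounds_for_mean(v1, lam1, lam2):
    """Lower and upper bounds on the second raw moment for a given mean."""
    return v1 * v1, (lam1 + lam2) * v1 - lam1 * lam2

# Two-point distribution: density u at lam1 and 1 - u at lam2.
lam1, lam2, u = 0.75, 0.10, 0.4
v1 = u * lam1 + (1.0 - u) * lam2             # mean
v2 = u * lam1 ** 2 + (1.0 - u) * lam2 ** 2   # second raw moment

lower, upper = second_moment_bounds_for_mean(v1, lam1, lam2)
print(v1, v2, lower, upper)
```

The second moment of the two-point distribution coincides with the upper bound, while a distribution with all its density at the mean would give the lower bound.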
We now use these bounds on $V_2$ in equation 6.27 to obtain bounds on the mean $V_1$. We first consider the lower bound. In equation 6.27 we replace $V_2$ with $V_1^2$ and see that, when $C_{12}$ is positive,

(6.50)   $C_{10} + (C_{11} - 1)V_1 + C_{12}V_1^2 \le 0 \qquad (0 < C_{12}),$

and, when $C_{12}$ is negative,

(6.51)   $C_{10} + (C_{11} - 1)V_1 + C_{12}V_1^2 \ge 0 \qquad (C_{12} < 0).$

These inequalities place some restrictions on $V_1$. We denote the quadratic expression appearing in these inequalities by $J(V_1)$, that is,

(6.52)   $J(V_1) = C_{10} + (C_{11} - 1)V_1 + C_{12}V_1^2.$

We see at once that when $V_1 = 0$ we have $J(0) = C_{10} = a_2$, which is positive. Furthermore, when $V_1 = 1$ we have $J(1) = C_{10} + (C_{11} - 1) + C_{12}$, and from the definitions of the $C$'s we get $J(1) = (a_1 - a_2 + \alpha_2 - 1) + (\alpha_1 - \alpha_2) + a_2 = \alpha_1 + a_1 - 1$. Using the additional fact that $a_1 = \lambda_1(1 - \alpha_1)$, we have $J(1) = -(1 - \lambda_1)(1 - \alpha_1)$, and this is a negative quantity. Thus we have established that the quadratic function $J$ is positive at zero and negative at unity, and so it follows that $J$ is zero for only one value of $V_1$ between zero and unity. We denote this root by $Y$, that is, $J(Y) = 0$. In Fig. 6.8 we show an example of $J(V_1)$ for $a_1 = 0.3$, $a_2 = 0.01$, $\alpha_1 = 0.6$, and $\alpha_2 = 0.9$, giving the function

(6.53)   $J(V_1) = -0.30V_1^2 + 0.19V_1 + 0.01.$

We observe from this figure that $J$ is zero at about $Y = 0.68$.

We may solve the quadratic equation $J(Y) = 0$ for $Y$ to obtain

(6.54)   $Y = \dfrac{(1 - C_{11}) - \sqrt{(1 - C_{11})^2 - 4C_{12}C_{10}}}{2C_{12}}.$

It may be observed that this expression is identical to the approximate formula 6.8 obtained from the expected operator approximation. We see from Fig. 6.8 that when $J(V_1)$ is positive we have $V_1 < Y$, and when $J(V_1)$ is negative we have $Y < V_1$. But we already know from inequalities 6.50 and 6.51 that $J(V_1)$ is positive when $C_{12} < 0$, that is, when $\alpha_1 < \alpha_2$, and that $J(V_1)$ is negative when $0 < C_{12}$, that is, when $\alpha_2 < \alpha_1$. Thus we have

(6.55)   $V_1 \le Y$ for $\alpha_1 < \alpha_2$,
         $V_1 \ge Y$ for $\alpha_2 < \alpha_1$.

This is one of the desired bounds on $V_1$.


Fig. 6.8. Illustration of the functions $J(V_1)$ and $H(V_1)$ plotted from equations 6.53 and 6.57 with the parameters $a_1 = 0.3$, $\alpha_1 = 0.6$, $a_2 = 0.01$, $\alpha_2 = 0.9$. The two bounds, $Y$ and $Z$ of equations 6.61 and 6.62, are also indicated.


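We next investigate the consequences of the upper bound on $V_2$ given by inequality 6.49. We replace $V_2$ by the expression $(\lambda_1 + \lambda_2)V_1 - \lambda_1\lambda_2$ in the left side of equation 6.27 and obtain a function

(6.56)   $H(V_1) = C_{10} + (C_{11} - 1)V_1 + C_{12}[(\lambda_1 + \lambda_2)V_1 - \lambda_1\lambda_2]$
         $\quad = (C_{10} - \lambda_1\lambda_2 C_{12}) + [(C_{11} - 1) + C_{12}(\lambda_1 + \lambda_2)]V_1.$

Using the definitions of the $C$'s and the relation $a_i = \lambda_i(1 - \alpha_i)$, we can write this function in the form

(6.57)   $H(V_1) = \lambda_2[(1 - \lambda_1)(1 - \alpha_2) + \lambda_1(1 - \alpha_1)] - [\lambda_2(1 - \alpha_1) + (1 - \lambda_1)(1 - \alpha_2)]V_1.$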


When written in this form we see that the slope of the line $H(V_1)$ versus $V_1$ will be negative and $H(0)$ will be positive. We call $Z$ the point at which $H$ is zero, that is, $H(Z) = 0$. An example of the function $H(V_1)$ is given in Fig. 6.8.

Since $V_2$ is less than or equal to $(\lambda_1 + \lambda_2)V_1 - \lambda_1\lambda_2$ we have $H(V_1) \ge 0$ for $0 < C_{12}$, that is, for $\alpha_2 < \alpha_1$. But $H(V_1) \ge 0$ implies that $V_1 \le Z$, and so

(6.58)   $V_1 \le Z$ for $\alpha_2 < \alpha_1$.

Similarly, $H(V_1) \le 0$ for $C_{12} < 0$ or $\alpha_1 < \alpha_2$, but $H(V_1) \le 0$ implies that $V_1 \ge Z$; and so

(6.59)   $V_1 \ge Z$ for $\alpha_1 < \alpha_2$.

The last two inequalities provide the other bound on $V_1$. We obtain the value of $Z$ by solving the equation $H(Z) = 0$.

We can summarize the preceding results to obtain

BOUNDS ON $V_1$ FROM BOUNDS ON $V_2$:

(6.60)   $Z \le V_1 \le Y$ for $\alpha_1 < \alpha_2$,
         $Y \le V_1 \le Z$ for $\alpha_2 < \alpha_1$,

where

(6.61)   $Y = \dfrac{(1 - C_{11}) - \sqrt{(1 - C_{11})^2 - 4C_{12}C_{10}}}{2C_{12}},$

(6.62)   $Z = \dfrac{\lambda_1\lambda_2 C_{12} - C_{10}}{(C_{11} - 1) + C_{12}(\lambda_1 + \lambda_2)},$

and the $C$'s are defined by equation 6.26:

$C_{10} = a_2, \qquad C_{11} = a_1 - a_2 + \alpha_2, \qquad C_{12} = \alpha_1 - \alpha_2.$

For the numerical example of $a_1 = 0.3$, $\alpha_1 = 0.6$, $a_2 = 0.01$, and $\alpha_2 = 0.9$, we have $\lambda_1 = 0.75$, $\lambda_2 = 0.10$, $C_{11} = 1.19$, and $C_{12} = -0.30$. Thus we get for the bounds

$0.500 \le V_1 \le 0.682.$

These bounds may be compared with the approximate value of 0.673 obtained from the single stat-rat distribution shown in Fig. 6.3. In Table 6.3 we show several other numerical examples of these bounds.
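The two bounds $Y$ and $Z$ are simple enough to compute directly from the operator parameters. The Python sketch below follows equations 6.61 and 6.62 with $\lambda_i = a_i/(1 - \alpha_i)$; the function name is an illustrative assumption, and the sketch presumes $\alpha_1 \ne \alpha_2$ and $\alpha_i < 1$.

```python
import math

def bounds_from_second_moment(a1, a2, alpha1, alpha2):
    """Return (lower, upper) bounds on the asymptotic mean from Y and Z."""
    lam1 = a1 / (1.0 - alpha1)
    lam2 = a2 / (1.0 - alpha2)
    c10, c11, c12 = a2, a1 - a2 + alpha2, alpha1 - alpha2
    # Y: root of J(V) = C10 + (C11 - 1) V + C12 V^2 lying in the unit interval.
    y = ((1.0 - c11) - math.sqrt((1.0 - c11) ** 2 - 4.0 * c12 * c10)) / (2.0 * c12)
    # Z: zero of the straight line H(V).
    z = (lam1 * lam2 * c12 - c10) / ((c11 - 1.0) + c12 * (lam1 + lam2))
    return min(y, z), max(y, z)

lower, upper = bounds_from_second_moment(0.3, 0.01, 0.6, 0.9)
print(round(lower, 3), round(upper, 3))
```

Returning the smaller and larger of the two values avoids branching on the sign of $\alpha_1 - \alpha_2$; which of $Y$ and $Z$ is the upper bound is given by inequalities 6.60.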




TABLE 6.3

Bounds on the asymptotic mean, $V_{1,\infty}$, for seven numerical examples. The limits $\lambda_1$ and $\lambda_2$ are the asymptotes obtained by applying one operator only. The bounds on $V_{1,\infty}$ were obtained by maximizing and minimizing the second raw moment, $V_2$, and the third raw moment, $V_3$.

        Parameter Values                  Limits                Bounds on $V_{1,\infty}$
$a_1$    $a_2$     $\alpha_1$  $\alpha_2$    $\lambda_2$  $\lambda_1$    From $V_2$     From $V_3$

0.300    0.010     0.6    0.9      0.10     0.75      0.500-0.682    0.655-0.676
0.300    0.001     0.6    0.9      0.01     0.75      0.112-0.668    0.418-0.658
0.300    0.0001    0.6    0.9      0.001    0.75      0.013-0.667    0.093-0.654
0.396    0.001     0.6    0.9      0.01     0.99      0.394-0.987    0.967-0.986
0.396    0.003     0.6    0.7      0.01     0.99      0.184-0.961    0.723-0.878
0.360    0.03      0.6    0.7      0.10     0.90      0.557-0.718    0.637-0.657
0.300    0         0.6    0.9      0        0.75      0-0.667        0-0.656


*6.6 IMPROVED BOUNDS ON THE 
ASYMPTOTIC MEAN 


In the preceding section we found the best possible bounds on $V_2$ for a distribution on the interval from $\lambda_2$ to $\lambda_1$ with mean $V_1$. These bounds on $V_2$ led to the bounds $Y$ and $Z$ on $V_1$. Better bounds on $V_1$ can be obtained from the best possible bounds on the third raw moment $V_3$ for given $V_1$ and $V_2$.* By arguments similar to those already used, it has been shown [8] that for a distribution on the interval $0 \le z \le 1$, the third raw moment $U_3$ satisfies the conditions

(6.63)   $\dfrac{U_2^2}{U_1} \le U_3 \le U_2 + \dfrac{(U_2 - U_1)^2}{U_1 - 1},$

where $U_2$ is the corresponding second raw moment and $U_1$ is the mean. When this distribution is transformed into one on the interval $\lambda_2 \le x \le \lambda_1$ the conditions become

(6.64)   $\lambda_2 V_2 + \dfrac{(V_2 - \lambda_2 V_1)^2}{V_1 - \lambda_2} \le V_3 \le \lambda_1 V_2 + \dfrac{(V_2 - \lambda_1 V_1)^2}{V_1 - \lambda_1}.$

These bounds on $V_3$ can be inserted in the recurrence formulas developed in Chapter 4 to eliminate $V_3$ and thereby make it possible to solve for bounds on $V_1$. In Section 6.8 we give formulas in full and illustrate the computations; in Table 6.3 we show the bounds for several numerical examples.

* We are indebted to Lotte Bailyn for suggesting this procedure.


*6.7 BOUNDS ON THE PRE-ASYMPTOTIC MEANS


The last two sections were restricted to a discussion of the asymptotic
distributions for two subject-controlled events. The restriction to
asymptotic distributions simplifies the exposition, but the development is
readily extended to give bounds on the mean for any trial n. In this 
section we give the results for any trial, but the derivations are omitted 
because they are tedious though straightforward. 

The distribution means $V_{1,n}$ lie between two bounds $Z_n$ and $Y_n$. Analogous to inequalities 6.60, we have

(6.65)   $Z_n \le V_{1,n} \le Y_n$ for $\alpha_1 < \alpha_2$,
         $Y_n \le V_{1,n} \le Z_n$ for $\alpha_2 < \alpha_1$.

The bound $Z_n$ can be computed from the recurrence formula

(6.66)   $Z_{n+1} = \alpha' Z_n + (1 - \alpha')Z,$

provided that the parameter $\alpha'$, defined by

(6.67)   $\alpha' = \alpha_2 + \dfrac{1 - \alpha_2}{1 - \alpha_1}a_1 - \dfrac{1 - \alpha_1}{1 - \alpha_2}a_2,$

is nonnegative. This recurrence formula is readily solved to give

(6.68)   $Z_n = (\alpha')^n Z_0 + [1 - (\alpha')^n]Z.$

The limit point $Z$ in this equation is the expression given by equation 6.62, and it may be seen that $Z_\infty = Z$. When $\alpha'$ is negative, these equations may not yield a bound on $V_{1,n}$ for all $n$. The initial value $Z_0$ is of course set equal to $V_{1,0}$.

The other bound $Y_n$ is obtained from the expected operator discussed in Section 6.4. We have

(6.69)   $Y_{n+1} = a_2 + (a_1 - a_2 + \alpha_2)Y_n + (\alpha_1 - \alpha_2)Y_n^2,$

provided that

(6.70)   $(a_1 - a_2 + \alpha_2) + (\alpha_1 - \alpha_2)Y_n \ge 0$

for all values of $Y_n$ in the range of interest. When this condition is not satisfied, the recurrence formula for $Y_n$ may not yield a bound on $V_{1,n}$. We take $Y_0 = V_{1,0}$, and it is readily shown that $Y_\infty = Y$, where $Y$ is given by equation 6.61. The recurrence formula for $Y_n$ cannot be solved by elementary means, but the differential equation solution, equation 6.11, can be used as an approximation. In Fig. 6.1 we show the bounds $Z_n$ and $Y_n$ for a numerical example.
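The two recurrences for the trial-by-trial bounds are easy to mechanize. The following Python sketch iterates both from a common starting value; the function name is an illustrative assumption, and the multiplier of $Z_n$ is computed as $C_{11} + C_{12}(\lambda_1 + \lambda_2)$, which is what results from replacing $V_{2,n}$ in the recurrence for the mean by its upper bound $(\lambda_1 + \lambda_2)V_{1,n} - \lambda_1\lambda_2$.

```python
def pre_asymptotic_bounds(n_trials, a1, a2, alpha1, alpha2, v0):
    """List of (Z_n, Y_n) pairs for n = 0, 1, ..., n_trials."""
    lam1 = a1 / (1.0 - alpha1)
    lam2 = a2 / (1.0 - alpha2)
    c10, c11, c12 = a2, a1 - a2 + alpha2, alpha1 - alpha2
    alpha_prime = c11 + c12 * (lam1 + lam2)      # multiplier in the Z recurrence
    z_limit = (lam1 * lam2 * c12 - c10) / ((c11 - 1.0) + c12 * (lam1 + lam2))
    z = y = v0                                    # both bounds start at V_{1,0}
    pairs = [(z, y)]
    for _ in range(n_trials):
        z = alpha_prime * z + (1.0 - alpha_prime) * z_limit
        y = c10 + c11 * y + c12 * y * y           # expected operator
        pairs.append((z, y))
    return pairs

pairs = pre_asymptotic_bounds(300, 0.3, 0.01, 0.6, 0.9, 0.2)
for n in (0, 1, 10, 300):
    print(n, pairs[n])
```

For the numerical example ($\alpha_1 < \alpha_2$) the sequence $Z_n$ stays below $Y_n$ on every trial, and the two sequences approach the asymptotic bounds $Z$ and $Y$.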




The bounds obtained from the upper and lower bounds on $V_3$ can also be extended to any trial $n$. We use the bounds

(6.71)   $V_{3,n} \ge \lambda_2 V_{2,n} + \dfrac{(V_{2,n} - \lambda_2 V_{1,n})^2}{V_{1,n} - \lambda_2}$

and

(6.72)   $V_{3,n} \le \lambda_1 V_{2,n} + \dfrac{(V_{2,n} - \lambda_1 V_{1,n})^2}{V_{1,n} - \lambda_1}$

in conjunction with the recurrence formulas

(6.73)   $V_{1,n+1} = C_{10} + C_{11}V_{1,n} + C_{12}V_{2,n},$
         $V_{2,n+1} = C_{20} + C_{21}V_{1,n} + C_{22}V_{2,n} + C_{23}V_{3,n},$

to obtain upper and lower bounds on $V_{1,n}$. These equations can be solved to give recurrence formulas in the means alone, but we will spare the reader the sight of the result since a straightforward computation procedure requires only the preceding four equations. We begin with values of $V_{1,0}$ and $V_{2,0}$, and compute $V_{1,1}$ from equation 6.73 and $V_{3,0}$ from equation 6.71. The latter result is used in equation 6.73 to obtain a bound on $V_{2,1}$. The value of $V_{1,1}$ and the bound on $V_{2,1}$ are then used in equation 6.71 to obtain a value of $V_{3,1}$, and the procedure is repeated. Such computations are very laborious indeed but can be mechanized readily if there is serious interest in obtaining the bounds on the means $V_{1,n}$. In Fig. 6.1 we give an example of these bounds.

The entire analysis of bounds in this chapter was restricted to two 
subject-controlled events. The derivations can be generalized to experi- 
menter-subject-controlled events in a straightforward way, but the formulas 
become lengthy and the computations tedious. Therefore we do not 
display them. 


6.8 SUMMARY 


In this chapter we present some approximate methods for determining 
properties of the distributions of p values. Most of the discussion is 
restricted to two subject-controlled events, but no special restrictions are 
placed on the parameters of the two operators $Q_1$ and $Q_2$.

A random number scheme for making approximate computations is 
discussed in Section 6.2. These "stat-rat" or “Monte Carlo" runs lead 
to estimates of the moments of the distribution of p values and are useful 
when exact computations are very laborious. In Section 6.3 we use the 
method to obtain the approximate form of the asymptotic distributions 
for several numerical examples. 

The expected operator $\bar{Q}$, which was used for the equal alpha case in Chapter 5, is considered for the more general case of $\alpha_1 \ne \alpha_2$ in Section 6.4. Approximate explicit formulas for the distribution means are developed from the expected operator for subject-controlled and experimenter-subject-controlled events.

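Sections 6.5, 6.6, and 6.7 discuss some upper and lower bounds on the means. In Section 6.5 we establish that the asymptotic mean $V_1$ satisfies inequalities 6.60.

In Section 6.6 we mention an improved pair of bounds $V_1'$ and $V_1''$ on the asymptotic mean $V_1$. These bounds are solutions of the two quadratic equations

(6.74)   $D_{20} + D_{21}V_1' + D_{22}(V_1')^2 = 0,$
         $D_{10} + D_{11}V_1'' + D_{12}(V_1'')^2 = 0,$

where the coefficients are, for $i = 1, 2$,

(6.75)   $D_{i0} = C_{10}^2 C_{23} - C_{12}^2 C_{20}\lambda_i + C_{10}C_{12}\lambda_i(C_{22} - 1 + C_{23}\lambda_i),$

         $D_{i1} = C_{12}^2(C_{20} - \lambda_i C_{21}) - C_{10}C_{12}(C_{22} - 1 - C_{23}\lambda_i) + (C_{11} - 1)[C_{12}\lambda_i(C_{22} - 1 + C_{23}\lambda_i) + 2C_{23}C_{10}],$

         $D_{i2} = C_{12}^2(C_{21} + C_{23}\lambda_i^2) - C_{12}(C_{22} - 1 - C_{23}\lambda_i)(C_{11} - 1) + C_{23}(C_{11} - 1)^2,$

and where

(6.76)   $C_{20} = a_2^2, \qquad C_{21} = a_1^2 - a_2^2 + 2a_2\alpha_2,$
         $C_{22} = 2(a_1\alpha_1 - a_2\alpha_2) + \alpha_2^2, \qquad C_{23} = \alpha_1^2 - \alpha_2^2.$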


To illustrate the procedure for obtaining the two foregoing sets of bounds on the asymptotic mean, we consider the parameters $a_1 = 0.3$, $a_2 = 0.01$, $\alpha_1 = 0.6$, and $\alpha_2 = 0.9$. From the relations $\lambda_i = a_i/(1 - \alpha_i)$ we see that $\lambda_1 = 0.75$ and $\lambda_2 = 0.10$. We then get the coefficients $C_{10} = 0.01$, $C_{11} = 1.19$, and $C_{12} = -0.30$. Using these numerical results in the equations for $Y$ and $Z$, we have the bounds expressed by

$0.500 \le V_1 \le 0.682.$

For computing the improved pair of bounds, the necessary coefficients are $C_{20} = 0.0001$, $C_{21} = 0.1079$, $C_{22} = 1.152$, and $C_{23} = -0.45$. These values then give $D_{20} = -0.78 \times 10^{-4}$, $D_{21} = -26.91 \times 10^{-4}$, and $D_{22} = 42.90 \times 10^{-4}$. Thus the quadratic equation for $V_1'$ is, after multiplication by $10^4$,

$-0.78 - 26.91V_1' + 42.90(V_1')^2 = 0.$

The positive root of this quadratic is 0.6550. In a similar way, we obtain for the quadratic in $V_1''$

$3.65625 + 4.14375V_1'' - 14.1375(V_1'')^2 = 0.$

The positive root of this quadratic is 0.6758, and so the final result is

$0.655 \le V_1 \le 0.676.$


The corresponding stat-rat mean, obtained in Section 6.3, is 0.6731. 
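The improved bounds in this numerical illustration can be reproduced with a short computation. The Python sketch below is a reconstruction under stated assumptions: the coefficient expressions in `improved_bound` are one algebraic arrangement obtained by eliminating $V_2$ and $V_3$ from the asymptotic forms of the recurrence relations together with the bound $V_3 = \lambda_i V_2 + (V_2 - \lambda_i V_1)^2/(V_1 - \lambda_i)$, and the function names are hypothetical.

```python
import math

def moment_coefficients(a1, a2, alpha1, alpha2):
    """C coefficients of the recurrences for the first and second raw moments."""
    c1 = (a2, a1 - a2 + alpha2, alpha1 - alpha2)
    c2 = (a2 ** 2,
          a1 ** 2 - a2 ** 2 + 2.0 * a2 * alpha2,
          2.0 * (a1 * alpha1 - a2 * alpha2) + alpha2 ** 2,
          alpha1 ** 2 - alpha2 ** 2)
    return c1, c2

def improved_bound(lam, c1, c2):
    """Positive root of D0 + D1*V + D2*V**2 = 0 for the limit point lam."""
    c10, c11, c12 = c1
    c20, c21, c22, c23 = c2
    m = c11 - 1.0
    w_plus = c22 - 1.0 + c23 * lam
    w_minus = c22 - 1.0 - c23 * lam
    d0 = c10 ** 2 * c23 - c12 ** 2 * c20 * lam + c10 * c12 * lam * w_plus
    d1 = (c12 ** 2 * (c20 - lam * c21) - c10 * c12 * w_minus
          + m * (c12 * lam * w_plus + 2.0 * c23 * c10))
    d2 = c12 ** 2 * (c21 + c23 * lam ** 2) - c12 * w_minus * m + c23 * m ** 2
    root = math.sqrt(d1 * d1 - 4.0 * d0 * d2)
    return max((-d1 + root) / (2.0 * d2), (-d1 - root) / (2.0 * d2))

a1, a2, alpha1, alpha2 = 0.3, 0.01, 0.6, 0.9
c1, c2 = moment_coefficients(a1, a2, alpha1, alpha2)
lower = improved_bound(a2 / (1.0 - alpha2), c1, c2)   # uses lambda_2
upper = improved_bound(a1 / (1.0 - alpha1), c1, c2)   # uses lambda_1
print(round(lower, 4), round(upper, 4))
```

Taking the larger root works here because, for these parameters, each quadratic has one negative and one positive root; other parameter sets may need the root selected by inspection.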

In Section 6.7 we describe a computation procedure for obtaining the bounds on the pre-asymptotic means. It is stated that under certain restrictions two operators, when applied repeatedly to $V_{1,0}$, generate upper and lower bounds on $V_{1,n}$. The reader is referred to that section for the details.


REFERENCES 


1. Student. The probable error of a correlation coefficient. Biometrika, 1908, VI, 302.

2. Tippett, L. H. C. Random sampling numbers, tracts for computors, No. XV. London: Cambridge University Press, 1927.

3. Kendall, M. G., and Smith, M. B. Random sampling numbers, tracts for computors, No. XXIV. London: Cambridge University Press, 1939.

4. Fisher, R. A., and Yates, F. Statistical tables. Edinburgh: Oliver and Boyd, 1939.

5. Snedecor, G. W. Statistical methods. Ames, Iowa: Iowa State College Press, 1946, pp. 10-13.

6. Arkin, H., and Colton, R. R. Tables for statisticians. New York: Barnes and Noble, 1950, pp. 142-145.

7. Peirce, B. O. A short table of integrals. Boston: Ginn and Co., 1929, third revised edition, p. 10, integral 68.

8. Bush, R. R., and Mosteller, F. A stochastic model with applications to learning. Annals of math. Stat., 1953, 24, 559-585.


CHAPTER 7 


Operators with Limits 


Zero and Unity 


7.1 INTRODUCTION


In many experimental situations "perfect" learning is possible and is 
nearly achieved by many subjects. If a particular response occurs 
repeatedly and is rewarded each time, that response may tend to occur 
with certainty. In a choice situation, one alternative may be rewarded 
while the others are not; we might expect that the organisms would tend 
to choose the rewarded alternative with probability one. For such 
problems, the model would use event operators that carry a response 
probability $p$ towards an asymptote of one, that is, operators with limit points $\lambda = 1$. From the fixed-point form of the operators, we can write

(7.1)   $Q_i p = \alpha_i p + (1 - \alpha_i) \qquad (\lambda_i = 1).$

Likewise, we are interested in operators which tend to make the response probability zero (complete extinction may be possible), and so we consider operators for which $\lambda_i = 0$, that is, operators of the form

(7.2)   $Q_i p = \alpha_i p \qquad (\lambda_i = 0).$

The discussion in this chapter is restricted to two subject-controlled events.

From a mathematical point of view there are four main cases with limits zero and unity for two events:

(I)   $\lambda_1 = 1$, $\lambda_2 = 0$,
(II)  $\lambda_1 = 0$, $\lambda_2 = 1$,
(III) $\lambda_1 = \lambda_2 = 1$,
(IV)  $\lambda_1 = \lambda_2 = 0$.

The latter two cases are discussed in the next chapter under the more general case of $\lambda_1 = \lambda_2 = \lambda$. In this chapter we discuss only the first two cases listed above. In both, operator $Q_1$ is applied with probability $p$, and operator $Q_2$ is applied with probability $1 - p$, since the events are subject controlled.

Case I above involves the operators

(7.3)   $Q_1 p = \alpha_1 p + (1 - \alpha_1),$
        $Q_2 p = \alpha_2 p.$


The first operator moves $p$ toward $p = 1$, and this operator is applied with probability $p$. Therefore, if we should ever achieve $p = 1$, the operator $Q_1$ is applied with certainty, and so $p = 1$ is a stable point in the process. We refer to $p = 1$ as an "absorbing barrier."* Similarly, $Q_2$ moves $p$ toward $p = 0$ and is applied with probability $1 - p$. Thus, if $p = 0$, $Q_2$ is certain to be applied and $p$ remains at zero. The point $p = 0$ is another stable point of the process, and we call it an absorbing barrier also. With operators of this form we might expect that all $p$ values would be "absorbed" at $p = 1$ or $p = 0$ in the limit as $n \to \infty$. We shall find that this is exactly what happens.
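The absorption described here can be observed in a small stat-rat (Monte Carlo) run. The Python sketch below is illustrative only: the function name, the seed, the number of sequences, and the parameter values $\alpha_1 = \alpha_2 = 0.5$, $p_0 = 0.5$ are assumptions chosen so that the run is quick and reproducible.

```python
import random

def simulate_case_one(p0, alpha1, alpha2, n_trials, n_rats, seed=0):
    """Final p values of n_rats independent sequences under the Case I operators."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_rats):
        p = p0
        for _ in range(n_trials):
            if rng.random() < p:              # response A1 occurs: apply Q1
                p = alpha1 * p + (1.0 - alpha1)
            else:                             # response A2 occurs: apply Q2
                p = alpha2 * p
        finals.append(p)
    return finals

finals = simulate_case_one(p0=0.5, alpha1=0.5, alpha2=0.5, n_trials=200, n_rats=2000)
absorbed = sum(1 for p in finals if min(p, 1.0 - p) < 1e-6)
share_at_one = sum(1 for p in finals if p > 0.5) / len(finals)
print(absorbed, len(finals), share_at_one)
```

After a few hundred trials every simulated sequence sits very close to one of the two barriers, and the share near unity estimates the asymptotic mean for this choice of parameters.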
Case II listed above leads to the operators

(7.4)   $Q_1 p = \alpha_1 p,$
        $Q_2 p = \alpha_2 p + (1 - \alpha_2).$

This is the reverse of Case I just discussed. The first operator moves $p$ toward zero, and the second moves $p$ toward unity. But $Q_1$ is applied with probability $p$; therefore, if we ever had $p = 0$, $Q_2$ would be applied with certainty, and so $p$ would not remain at zero. In the same way we see that if $p = 1$, then $Q_1$ is certain to be applied, taking $p$ below unity. Hence, in Case II, $p = 1$ and $p = 0$ are not absorbing barriers but instead are "reflecting barriers."

The Brunswik T-maze experiment again supplies examples for which the two cases above may be appropriate. If both right and left turns are rewarded on every trial they occur, the operators of equations 7.3 could be used. Turning right (response $A_1$) and finding food increases $p$, the probability of turning right, whereas turning left (response $A_2$) and finding food increases the probability of turning left, and so decreases $p$. If, on the other hand, both right and left turns fail to lead to reinforcement, the operators of equations 7.4 seem more appropriate. Extinction of turning right decreases $p$, and extinction of turning left decreases $q = 1 - p$ and so increases $p$.

In the following sections we consider some of the mathematical properties of Cases I and II. In Chapter 5 we handled these cases when $\alpha_1 = \alpha_2$; since there are many problems for which we would not care to make the equal alpha assumption, we treat unequal alphas here.

* In most physical problems involving an absorbing barrier, absorption is possible in a finite time. This is not the case here; "absorption" occurs only in the limit as $n \to \infty$ except when $\alpha_1 = 0$ or $\alpha_2 = 0$.


7.2 THE ASYMPTOTIC DISTRIBUTION FOR CASE I


We saw in the preceding section that Case I involves two absorbing barriers. The points $p = 0$ and $p = 1$ are stable points in the sense that, once one of these points is reached, a sequence of $p$ values forever remains there. As a result we anticipated that the asymptotic distribution would have all its density at $p = 0$ and $p = 1$. We now prove that this is true by showing that the first and second asymptotic raw moments are equal, that is,

(7.5)   $V_{1,\infty} = V_{2,\infty}.$

Once we have demonstrated that this is correct it is easy to show that the only distribution with such moments is a binomial one with density only at the points zero and unity. Moreover, from the definition of the mean it must be that the amount of density at $p = 1$ is $V_{1,\infty}$, as shown in Fig. 7.1.

Fig. 7.1. The asymptotic distribution for Case I. All the density is at $p = 0$ and $p = 1$; the amount at $p = 1$ is $V_{1,\infty}$.

We now show that the second raw moment $V_{2,\infty}$ equals the mean $V_{1,\infty}$. From equation 4.52 and the asymptotic conditions, $V_{1,n+1} = V_{1,n} = V_{1,\infty}$ and $V_{2,n} = V_{2,\infty}$, we have

(7.6)   $C_{10} + (C_{11} - 1)V_{1,\infty} + C_{12}V_{2,\infty} = 0.$

The coefficients are obtained from equations 4.53 with $a_2 = 0$ and $a_1 = 1 - \alpha_1$:

(7.7)   $C_{10} = 0,$
        $C_{11} = 1 - \alpha_1 + \alpha_2,$
        $C_{12} = \alpha_1 - \alpha_2.$

Substituting these in equation 7.6 gives

(7.8)   $(\alpha_1 - \alpha_2)V_{2,\infty} = (\alpha_1 - \alpha_2)V_{1,\infty},$

and, since we are considering only the cases for which $\alpha_1 \ne \alpha_2$, we conclude

(7.9)   $V_{2,\infty} = V_{1,\infty}.$

It is an easy matter to prove by induction that all the asymptotic raw moments are equal (see footnote, Section 3.3), but this follows immediately once we have established that the distribution is a binomial as shown in Fig. 7.1. The definition of the raw moments, equation 4.1, gives for $m = 1$ and $m = 2$, for discrete distributions,

(7.10)   $V_1 = \sum_\nu p_\nu P_\nu,$
         $V_2 = \sum_\nu p_\nu^2 P_\nu.$

We then equate these and get

$\sum_\nu p_\nu P_\nu = \sum_\nu p_\nu^2 P_\nu,$

or

(7.11)   $\sum_\nu p_\nu(1 - p_\nu)P_\nu = 0.$

All terms in this sum are nonnegative by definition since $0 \le p_\nu \le 1$ and $0 \le P_\nu \le 1$ for all $\nu$. Thus each term must be zero. This requires that $P_\nu = 0$ except when $p_\nu = 0$ or $p_\nu = 1$, because in these special cases $p_\nu(1 - p_\nu)$ vanishes instead of $P_\nu$. In other words, no density exists at points other than zero and unity when the first and second raw moments of a distribution on the unit interval are equal. This completes the proof that the asymptotic distribution for Case I is the simple binomial distribution in Fig. 7.1. (This proof for discrete distributions is only suggestive for more general ones.)

The only problem which remains is to determine the one parameter $V_{1,\infty}$ of the asymptotic distribution. Unfortunately it does not appear possible to determine this quantity as an elementary function of the operator parameters. In the next section we develop some bounds on $V_{1,\infty}$ and show that it depends upon the initial distribution of $p$ values. In Section 7.5 we mention a more elaborate approach for determining $V_{1,\infty}$.


7.3 BOUNDS ON THE ASYMPTOTIC MEAN FOR 
CASE I 


In the preceding section we proved that all the density finally reaches zero or unity. This includes the possibility that all the density reaches unity, that is, that $V_{1,\infty} = 1$, which in turn would imply that all moments about the mean were zero. Such an asymptotic result would be an especially simple state of affairs, and the interpretation would be that all organisms would ultimately achieve perfect learning. In this section we demonstrate that this is not the case except when the initial probability is already unity or when $\alpha_2 = 1$. We further prove that the asymptotic mean depends upon the initial probability $p_0$ except when $\alpha_1 = 1$ or $\alpha_2 = 1$. (The asymptotic distribution theorem given in Section 4.7 does not apply to the present problem since $\lambda_1 - \lambda_2 = 1$.)

To carry out the proofs we need some upper and lower bounds on $V_{1,\infty}$. First we derive an expression for the probability $P_{2,\infty}$ of an infinite chain of $A_2$ responses; such sequences will terminate at $p = 0$. Since the asymptotic mean, $V_{1,\infty}$, is identical to the sum of the probabilities of all sequences which terminate at unity, $V_{1,\infty}$ cannot be greater than $1 - P_{2,\infty}$. This then provides an upper bound for $V_{1,\infty}$. A lower bound on $V_{1,\infty}$ is obtained in a similar way. The probability $P_{1,\infty}$ of an infinite chain of $A_1$ responses is computed; because such a sequence terminates at $p = 1$, $V_{1,\infty}$ is at least as large as $P_{1,\infty}$. We thus have the

BOUNDS ON THE ASYMPTOTIC MEAN:

(7.12)   $P_{1,\infty} \le V_{1,\infty} \le 1 - P_{2,\infty}.$

We now proceed to compute these bounds. The probability, $P_{2,n}$, of a chain of precisely $n$ responses of type $A_2$ is

(7.13)   $P_{2,n} = (1 - p_0)(1 - Q_2 p_0)(1 - Q_2^2 p_0) \cdots (1 - Q_2^{n-1}p_0) = \prod_{\nu=0}^{n-1}(1 - Q_2^\nu p_0).$

From the second of equations 7.3, we have $Q_2 p = \alpha_2 p$, and so

(7.14)   $Q_2^\nu p_0 = \alpha_2^\nu p_0.$

Thus

(7.15)   $P_{2,n} = \prod_{\nu=0}^{n-1}(1 - \alpha_2^\nu p_0).$

In the limit when $n \to \infty$, we have

(7.16)   $P_{2,\infty} = \prod_{\nu=0}^{\infty}(1 - \alpha_2^\nu p_0).$

Hence, $P_{2,\infty}$ is a function of two parameters $\alpha_2$ and $p_0$. It is an example of the function

(7.17)   $\Phi(\alpha, \beta) = \prod_{\nu=0}^{\infty}(1 - \alpha^\nu\beta),$

which we have computed and present in Table 7.1.
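The infinite product of equation 7.17 converges quickly when $\alpha < 1$, so it can be evaluated by multiplying factors until they are indistinguishable from unity. The Python sketch below is one way to do this; the function name `phi`, the tolerance, and the cap on the number of factors are illustrative assumptions.

```python
def phi(alpha, beta, tol=1e-12, max_terms=100000):
    """Truncated evaluation of the product over nu of (1 - alpha**nu * beta)."""
    product = 1.0
    term = beta                    # alpha**nu * beta, starting at nu = 0
    for _ in range(max_terms):
        if term < tol:             # remaining factors differ from 1 by less than tol
            break
        product *= 1.0 - term
        term *= alpha
    return product

for alpha, beta in [(0.0, 0.3), (0.5, 0.5), (0.9, 0.1)]:
    print(alpha, beta, round(phi(alpha, beta), 3))
```

The values printed for these three pairs can be compared with the corresponding entries of Table 7.1.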




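For some purposes it is convenient to have an approximation to $P_{2,\infty}$, and this is developed below.* We take the logarithm of both sides of equation 7.15:

(7.18)   $\log P_{2,n} = \sum_{\nu=0}^{n-1}\log(1 - \alpha_2^\nu p_0).$

Provided that $-1 < \alpha_2^\nu p_0 < +1$, we may expand each of the logarithm terms on the right and change signs (note: $\log(1 - x) = -x - x^2/2 - x^3/3 - \cdots$):

(7.19)   $-\log P_{2,n} = \sum_{\nu=0}^{n-1}\left[\alpha_2^\nu p_0 + \dfrac{(\alpha_2^\nu p_0)^2}{2} + \dfrac{(\alpha_2^\nu p_0)^3}{3} + \cdots\right] = \sum_{\nu=0}^{n-1}\sum_{u=1}^{\infty}\dfrac{1}{u}(\alpha_2^\nu p_0)^u.$

We then interchange the order of summation and perform the summation over $\nu$ to obtain

(7.20)   $-\log P_{2,n} = \sum_{u=1}^{\infty}\dfrac{p_0^u}{u}\cdot\dfrac{1 - \alpha_2^{nu}}{1 - \alpha_2^u}.$

When $\alpha_2 < 1$, we have in the limit as $n \to \infty$,

(7.21)   $-\log P_{2,\infty} = \sum_{u=1}^{\infty}\dfrac{p_0^u}{u}\cdot\dfrac{1}{1 - \alpha_2^u}.$

This sum is difficult to evaluate, but if we replace $\alpha_2^u$ with $\alpha_2$ we shall certainly cause each term beyond the first to increase, and so we have the inequality

(7.22)   $-\log P_{2,\infty} \le \sum_{u=1}^{\infty}\dfrac{p_0^u}{u}\cdot\dfrac{1}{1 - \alpha_2}.$

This sum involves the expansion of $-\log(1 - p_0)$, and we see that

(7.23)   $-\log P_{2,\infty} \le -\dfrac{\log(1 - p_0)}{1 - \alpha_2}.$

It then follows that

(7.24)   $P_{2,\infty} \ge (1 - p_0)^{1/(1 - \alpha_2)}.$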


* We are grateful to William J. McGill for assistance in carrying out these 
computations. 




The approximation used in going from equation 7.21 to inequality 7.22 is useful since it permitted us to complete the summation and thus obtain expression 7.24, which leads to an upper bound on the asymptotic mean:

(7.25)   $V_{1,\infty} \le 1 - (1 - p_0)^{1/(1 - \alpha_2)}.$

This bound already shows that the asymptotic mean cannot be unity unless either $p_0 = 1$ or $\alpha_2 = 1$. (When $\alpha_2 = 1$, we have a special case considered in the next chapter.)
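How much is lost by the approximation can be seen by comparing the exact infinite product for $P_{2,\infty}$ with the closed-form lower bound of expression 7.24. The Python sketch below does this for the illustrative values $p_0 = 0.5$ and $\alpha_2 = 0.5$; these values, the function name, and the fixed number of factors are assumptions made for the example.

```python
def infinite_product(alpha, beta, n_terms=2000):
    """Product over nu = 0, ..., n_terms - 1 of (1 - alpha**nu * beta)."""
    product = 1.0
    for nu in range(n_terms):
        product *= 1.0 - (alpha ** nu) * beta
    return product

p0, alpha2 = 0.5, 0.5
exact = infinite_product(alpha2, p0)             # P_{2,infinity} from the product
approx = (1.0 - p0) ** (1.0 / (1.0 - alpha2))    # closed-form lower bound on it
upper_exact = 1.0 - exact                        # upper bound on the asymptotic mean
upper_approx = 1.0 - approx                      # weaker closed-form upper bound
print(round(exact, 3), round(approx, 3), round(upper_exact, 3), round(upper_approx, 3))
```

The closed form is the more conservative of the two upper bounds on the asymptotic mean, which is the price paid for being able to complete the summation.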
The reader may be curious at this point why we did not obtain an upper bound, corresponding to equation 7.25, for the general case when

(7.26)   $Q_2 p = \alpha_2 p + (1 - \alpha_2)\lambda_2.$

The answer is simply this. The probability $P_{2,\infty}$ of an infinite sequence of $A_2$ responses is zero unless $\lambda_2 = 0$. Let us refer to equation 3.5, which may be written as

(7.27)   $Q_2^\nu p_0 = \alpha_2^\nu p_0 + (1 - \alpha_2^\nu)\lambda_2,$

where $\lambda_2$ is the asymptote achieved when $Q_2$ is applied repeatedly to $p_0$. First consider $p_0 \le \lambda_2$. Then we observe that $Q_2^\nu p_0 \ge p_0$ and so $1 - Q_2^\nu p_0 \le 1 - p_0$. From equation 7.13 we get

(7.28)   $P_{2,n} \le (1 - p_0)^n.$

Since $(1 - p_0)^n$ approaches zero as $n$ becomes infinite, $P_{2,\infty}$ is zero unless $p_0 = 0$. Second, consider $p_0 > \lambda_2$. Note that $Q_2^\nu p_0 > \lambda_2$ and hence that $1 - Q_2^\nu p_0 < 1 - \lambda_2$. So again from equation 7.13 we have

(7.29)   $P_{2,n} < (1 - \lambda_2)^n,$

and so in the limit we have $P_{2,\infty} = 0$ unless $\lambda_2 = 0$. Only in the special cases for which either $p_0 = 0$ or $\lambda_2 = 0$ do we obtain a $P_{2,\infty} > 0$. The condition $p_0 = 0$ needs no further discussion, and the case of $\lambda_2 = 0$ is being considered in this chapter.

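We may also compute the probability $P_{1,n}$ of a chain of n responses of
type $A_1$ to obtain another bound on $V_{1,\infty}$. We have

(7.30)  $P_{1,n} = \prod_{\nu=0}^{n-1} Q_1^{\nu} p_0.$

But from equation 3.5 we see that, for $\lambda_1 = 1$,

(7.31)  $Q_1^{\nu} p_0 = \alpha_1^{\nu} p_0 + (1 - \alpha_1^{\nu}) = 1 - \alpha_1^{\nu} q_0,$

and so

(7.32)  $P_{1,n} = \prod_{\nu=0}^{n-1} (1 - \alpha_1^{\nu} q_0).$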


In the limit as $n \to \infty$ we have

(7.33)  $P_{1,\infty} = \prod_{\nu=0}^{\infty} (1 - \alpha_1^{\nu} q_0) = P(\alpha_1, q_0),$

where $P(\alpha_1, q_0)$ is the function defined by equation 7.17 and presented in
Table 7.1. By analogy with our previous development of an approximation
for $P_{2,\infty}$, we obtain

(7.34)  $P_{1,\infty} \ge (1 - q_0)^{1/(1 - \alpha_1)}.$


TABLE 7.1

The function $P(\alpha, p) = \prod_{\nu=0}^{\infty} (1 - \alpha^{\nu} p)$.

       |                                    alpha
  p    |  0      0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
 ------+------------------------------------------------------------------------------
  0    | 1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1
  0.1  | 0.900  0.890  0.878  0.862  0.841  0.813  0.772  0.709  0.598  0.358  0
  0.2  | 0.800  0.783  0.760  0.733  0.698  0.650  0.586  0.492  0.346  0.120  0
  0.3  | 0.700  0.677  0.648  0.612  0.568  0.510  0.434  0.331  0.193  0.038  0
  0.4  | 0.600  0.574  0.541  0.501  0.452  0.390  0.313  0.216  0.102  0.011  0
  0.5  | 0.500  0.473  0.439  0.398  0.349  0.289  0.217  0.134  0.051  0.003  0
  0.6  | 0.400  0.373  0.342  0.303  0.258  0.204  0.143  0.079  0.024  0.001  0
  0.7  | 0.300  0.277  0.249  0.216  0.178  0.134  0.087  0.042  0.010  0.000  0
  0.8  | 0.200  0.182  0.161  0.137  0.109  0.078  0.047  0.020  0.003  0.000  0
  0.9  | 0.100  0.090  0.078  0.065  0.050  0.034  0.019  0.007  0.001  0.000  0
  1.0  | 0      0      0      0      0      0      0      0      0      0      0


This result yields a lower bound on the asymptotic mean:

(7.35)  $V_{1,\infty} \ge p_0^{1/(1 - \alpha_1)}.$

Combining this with inequality 7.25 we have a new pair of
BOUNDS ON THE ASYMPTOTIC MEAN:

(7.36)  $p_0^{1/(1 - \alpha_1)} \le V_{1,\infty} \le 1 - (1 - p_0)^{1/(1 - \alpha_2)}.$

These bounds on the asymptotic mean can now be used to demonstrate
that $V_{1,\infty}$ depends in general on the initial probability $p_0$. A numerical
example will suffice. Let $\alpha_1 = 2/3$ and $\alpha_2 = 3/4$, so that the bounds are

(7.37)  $p_0^3 \le V_{1,\infty} \le 1 - (1 - p_0)^4.$




For $p_0 = 0.1$ we have

(7.38)  $0.001 < V_{1,\infty} < 0.344,$

and for $p_0 = 0.9$ we have

(7.39)  $0.729 < V_{1,\infty} < 0.9999.$

Clearly since $V_{1,\infty}$ must be below 0.344 when $p_0 = 0.1$ but must be above
0.729 when $p_0 = 0.9$, it follows that $V_{1,\infty}$ must indeed depend upon $p_0$.
In other words, the proportion of organisms reaching p = 1 depends upon
the initial response probability.
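The arithmetic of this example is easy to reproduce. The following Python sketch evaluates the lower bound, the initial probability raised to the power 1/(1 - alpha_1), and the upper bound, one minus (1 - p_0) raised to the power 1/(1 - alpha_2); the function name `asymptotic_mean_bounds` is a hypothetical label, not something defined in the text.

```python
def asymptotic_mean_bounds(alpha1, alpha2, p0):
    """Return (lower, upper) bounds on the asymptotic mean for Case I.

    lower = p0 ** (1 / (1 - alpha1))            (chain of all A_1 responses)
    upper = 1 - (1 - p0) ** (1 / (1 - alpha2))  (complement of all A_2 responses)
    Both alpha1 and alpha2 are assumed to lie in [0, 1).
    """
    lower = p0 ** (1.0 / (1.0 - alpha1))
    upper = 1.0 - (1.0 - p0) ** (1.0 / (1.0 - alpha2))
    return lower, upper


for p0 in (0.1, 0.9):
    low, high = asymptotic_mean_bounds(2.0 / 3.0, 3.0 / 4.0, p0)
    print(f"p0 = {p0}: {low:.4f} <= V <= {high:.4f}")
```

The two printed intervals do not overlap, which is the whole of the argument that the asymptotic mean depends on the initial probability.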


Fig. 7.2. The correct means of the p-value distributions for the first five trials,
the expected operator approximation for ten trials, and the bounds (0.109, 0.508)
on the asymptotic mean, $V_{1,\infty}$. The values $\lambda_1 = 1$, $\lambda_2 = 0$, $\alpha_1 = 0.4$, $\alpha_2 = 0.7$,
and $p_0 = 0.2$ were used for the illustration. (Horizontal axis: trials, n.)


The dependence upon $p_0$ of $V_{1,\infty}$ is further substantiated by what
happens in two trivial cases. We know that if $p_0 = 0$, the operator $Q_2$
is always applied and so the probability forever remains at zero, that is,
$V_{1,\infty} = 0$. We also know that when $p_0 = 1$ the operator $Q_1$ is always
applied and the probability remains at unity, that is, $V_{1,\infty} = 1$. We
conclude, therefore, that as $p_0$ goes from zero to unity, $V_{1,\infty}$ increases from
zero to unity.

The reader may have detected that the upper and lower bounds of
inequalities 7.36 become very near unity and zero, respectively, when
$\alpha_1$ and $\alpha_2$ are near unity, provided that $p_0$ is not too near zero or unity.




It should not be inferred from this that the asymptotic mean becomes
less and less sensitive to $p_0$ as $\alpha_1$ and $\alpha_2$ approach unity, however, because
the bounds of inequalities 7.36 may be far from the correct mean. Those
bounds were obtained from a consideration of only the sequences of all
$A_1$ responses and all $A_2$ responses. As the parameters $\alpha_1$ and $\alpha_2$ approach
unity, other sequences become more important, and hence the bounds
we derived become less sensitive.

To illustrate how the upper and lower bounds surround the true means, 
we have computed the means of all possible sequences for five trials. 
These are shown in Fig. 7.2 along with the limits computed from
inequalities 7.12. Also shown in that figure are means obtained from the
expected operator approximation discussed in Section 6.4. 


7.4 FURTHER RESTRICTIONS ON CASE I 


We now consider a very special case which is easy to compute. Let
$\alpha_1 = 0$ so that the two operators are

(7.40)  $Q_1 p = 1,$
        $Q_2 p = \alpha_2 p.$

The first asserts that whenever a response $A_1$ occurs the p value jumps
immediately to unity, where it remains. Thus the only sequence that
terminates at zero p value is the one containing an infinite chain of $A_2$
responses. It follows then that

(7.41)  $V_{1,\infty} = 1 - P_{2,\infty} = 1 - P(\alpha_2, p_0),$

where $P(\alpha_2, p_0)$ is the function given in Table 7.1.
We have another very special case when we take $\alpha_2 = 0$ so that the
operators are

(7.42)  $Q_1 p = \alpha_1 p + (1 - \alpha_1),$
        $Q_2 p = 0.$

This case is complementary to the preceding one; whenever an $A_2$
response occurs the probability goes immediately to zero and remains
there. Hence the asymptotic mean $V_{1,\infty}$ must equal the probability
$P_{1,\infty}$ of an infinite chain of $A_1$ responses, that is,

(7.43)  $V_{1,\infty} = P_{1,\infty} = P(\alpha_1, q_0).$

We have still another special case when we set $\alpha_1 = \alpha_2 = \alpha$. From
the analysis in Chapter 5 we know that for $\lambda_1 = 1$ and $\lambda_2 = 0$ we have

        $V_{1,\infty} = p_0.$




*7.5 THE FUNCTIONAL EQUATION FOR CASE I 


In Case I the asymptotic distribution has all its density at p = 0 and
p = 1 and is therefore described by a single parameter, $V_{1,\infty}$, which gives
the amount of density at p = 1. In the preceding sections we derived
some bounds on $V_{1,\infty}$ and considered the special cases when $\alpha_1 = 0$, when
$\alpha_2 = 0$, and when $\alpha_1 = \alpha_2 = \alpha$. Moreover, we demonstrated that $V_{1,\infty}$
depends on $p_0$, the initial probability. The asymptotic mean, then, is a
function of $\alpha_1$, $\alpha_2$, and $p_0$, so that we write

(7.44)  $V_{1,\infty} = \eta(\alpha_1, \alpha_2, p_0).$

This function $\eta$ may be considered to be the probability that a "particle"
beginning at $p_0$ will end up at unity when the operator parameters are
$\alpha_1$ and $\alpha_2$. On the first trial either $Q_1$ or $Q_2$ is applied to $p_0$. Suppose
that $Q_1$ is applied. The particle is then at $Q_1 p_0$, and the probability
that it will be at unity asymptotically is $\eta(\alpha_1, \alpha_2, Q_1 p_0)$. Similarly, if
$Q_2$ is applied, the probability that the particle will then go to unity
asymptotically is $\eta(\alpha_1, \alpha_2, Q_2 p_0)$. We know, of course, that $Q_1$ is applied
with probability $p_0$ and $Q_2$ with probability $1 - p_0$. Therefore, we may
write

(7.45)  $\eta(\alpha_1, \alpha_2, p_0) = p_0\, \eta(\alpha_1, \alpha_2, Q_1 p_0) + (1 - p_0)\, \eta(\alpha_1, \alpha_2, Q_2 p_0),$

or, using the expressions for the operators, $Q_1 p_0$ and $Q_2 p_0$, given by
equations 7.3, we write

(7.46)  $\eta(\alpha_1, \alpha_2, p_0) = p_0\, \eta(\alpha_1, \alpha_2, 1 - \alpha_1 + \alpha_1 p_0) + (1 - p_0)\, \eta(\alpha_1, \alpha_2, \alpha_2 p_0).$

This equation is a particular type of functional equation. Its properties
and solution have been investigated by Shapiro and Bellman [1]. They
have shown that no simple solution exists in closed form, but they have
developed a numerical procedure for obtaining values of $\eta$.
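The numerical procedure of Shapiro and Bellman is not reproduced here. As a rough illustration of how the functional equation can be attacked, the Python sketch below simply iterates the right side of the equation on a uniform grid of p values, starting from the guess that the function equals p and interpolating linearly between grid points. The function name `solve_eta`, the grid size, and the number of sweeps are all assumptions made for this example; the scheme is a plain fixed-point iteration and should not be read as their method.

```python
def solve_eta(alpha1, alpha2, grid_points=401, sweeps=400):
    """Approximate the Case I absorption probability on a grid of p values.

    Repeats the update
        eta(p) <- p * eta(1 - alpha1 + alpha1 * p) + (1 - p) * eta(alpha2 * p)
    starting from eta(p) = p. Values between grid points are obtained by
    linear interpolation. Returns (grid, values).
    """
    m = grid_points - 1
    grid = [i / m for i in range(grid_points)]
    values = grid[:]  # initial guess: eta(p) = p

    def interpolate(vals, x):
        x = min(max(x, 0.0), 1.0)      # guard against rounding outside [0, 1]
        i = min(int(x * m), m - 1)     # index of the left grid point
        frac = x * m - i
        return (1.0 - frac) * vals[i] + frac * vals[i + 1]

    for _ in range(sweeps):
        values = [
            p * interpolate(values, 1.0 - alpha1 + alpha1 * p)
            + (1.0 - p) * interpolate(values, alpha2 * p)
            for p in grid
        ]
    return grid, values


if __name__ == "__main__":
    grid, values = solve_eta(2.0 / 3.0, 3.0 / 4.0)
    for index in (40, 200, 360):  # p0 = 0.1, 0.5, 0.9 on the 401-point grid
        print(f"p0 = {grid[index]:.1f}: eta is roughly {values[index]:.3f}")
```

After k sweeps the grid values approximate the mean p value on trial k, so the iteration converges to the asymptotic mean; when the two operator parameters are equal, the starting guess is already a fixed point of the update.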

Shapiro and Bellman have also shown that the functional equation has
some special symmetry properties which result from the symmetry in the
basic operators $Q_1$ and $Q_2$. We may define a complementary function
$\bar{\eta}(\alpha_2, \alpha_1, q_0)$ which represents the probability that a particle beginning at
$q_0 = 1 - p_0$ will terminate at q = 0 when the operator parameters are
$\alpha_1$ and $\alpha_2$. By the same arguments as those given above, this function $\bar{\eta}$
must satisfy

(7.47)  $\bar{\eta}(\alpha_2, \alpha_1, q_0) = q_0\, \bar{\eta}(\alpha_2, \alpha_1, 1 - \alpha_2 + \alpha_2 q_0) + (1 - q_0)\, \bar{\eta}(\alpha_2, \alpha_1, \alpha_1 q_0).$

We see, then, that $\bar{\eta}$ must satisfy the same functional equation as $\eta$ when
$\alpha_1$ and $\alpha_2$ are interchanged and when $p_0$ and $q_0 = 1 - p_0$ are interchanged.
This property is of considerable importance in the computation scheme
developed by Shapiro and Bellman.




We have already obtained solutions of the functional equation for
some special cases. We see from Section 7.3 that, when $\alpha_1 = 1$, then
$V_{1,\infty} = 0$, and when $\alpha_2 = 1$, $V_{1,\infty} = 1$. Hence

(7.48)  $\eta(1, \alpha_2, p_0) = 0,$
        $\eta(\alpha_1, 1, p_0) = 1.$

Fig. 7.3. Showing some special values of the function $\eta(\alpha_1, \alpha_2, p_0)$ in the $\alpha_1$, $\alpha_2$
plane. The value at the upper right-hand corner is indeterminate.


In Section 7.4 we showed that

(7.49)  $\eta(0, \alpha_2, p_0) = 1 - P_{2,\infty} = 1 - P(\alpha_2, p_0),$
        $\eta(\alpha_1, 0, p_0) = P_{1,\infty} = P(\alpha_1, 1 - p_0),$

and

(7.50)  $\eta(\alpha, \alpha, p_0) = p_0.$

These five special solutions may be summarized if we consider $\eta$ to be a
function in the $\alpha_1$, $\alpha_2$ plane with a parameter $p_0$. In Fig. 7.3 we show
such a plot.

7.6 THE ASYMPTOTIC DISTRIBUTION FOR CASE II

Let us return to Case II defined in Section 7.1, by equations 7.4. Neither
p = 0 nor p = 1 is an absorbing barrier, and so the asymptotic distribution
is independent of the initial p values. The asymptotic distribution
depends upon $\alpha_1$ and $\alpha_2$, of course, and so we investigate this dependence.




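First consider the asymptotic mean $V_{1,\infty}$. From equation 4.52 we have
as $n \to \infty$,

(7.51)  $C_{10} + (C_{11} - 1) V_{1,\infty} + C_{12} V_{2,\infty} = 0.$

The coefficients of equations 4.53 are, for $a_1 = 0$ and $a_2 = (1 - \alpha_2)$,

(7.52)  $C_{10} = 1 - \alpha_2,$
        $C_{11} = 2\alpha_2 - 1,$
        $C_{12} = \alpha_1 - \alpha_2.$

When we insert these coefficients in equation 7.51 and solve for $V_{1,\infty}$ we
have

(7.53)  $V_{1,\infty} = \frac{1}{2} + \frac{\alpha_1 - \alpha_2}{2(1 - \alpha_2)} V_{2,\infty}.$

The second raw moment $V_{2,\infty}$ is never negative, and so when $\alpha_2 < 1$ we
have

(7.54)  $V_{1,\infty} \gtrless 1/2$  for  $\alpha_1 \gtrless \alpha_2.$

We can obtain better bounds than these on $V_{1,\infty}$ by the methods used in
Chapter 6. The first bound is readily obtained from inequality 6.44,
which can be written

(7.55)  $V_{2,\infty} \le V_{1,\infty}.$

Using this in equation 7.53 gives

(7.56)  $V_{1,\infty} \lessgtr \frac{1}{2} + \frac{\alpha_1 - \alpha_2}{2(1 - \alpha_2)} V_{1,\infty}$  for  $\alpha_1 \gtrless \alpha_2.$

Solving for $V_{1,\infty}$ in these last two inequalities we have

(7.57)  $V_{1,\infty} \lessgtr \frac{1 - \alpha_2}{2(1 - \alpha_2) - (\alpha_1 - \alpha_2)}$  for  $\alpha_1 \gtrless \alpha_2.$

The expected operator bound described in Section 6.5 leads to another
set of bounds, namely,

(7.58)  $V_{1,\infty} \gtrless \frac{(1 - \alpha_2) - \sqrt{(1 - \alpha_1)(1 - \alpha_2)}}{\alpha_1 - \alpha_2}$  for  $\alpha_1 \gtrless \alpha_2.$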


As a numerical example, let $\alpha_1 = 0.9$ and $\alpha_2 = 0.6$. Relations 7.57 and
7.58 then give

(7.59)  $0.667 < V_{1,\infty} < 0.800.$

Improved bounds may be obtained by the procedures given in Section 6.6.
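The two bounds in this example can be evaluated directly. The Python sketch below computes the expected operator bound, [(1 - alpha_2) - sqrt((1 - alpha_1)(1 - alpha_2))] / (alpha_1 - alpha_2), and the bound that follows from the second moment never exceeding the first, (1 - alpha_2) / (2 - alpha_1 - alpha_2), which is the same quantity as in relation 7.57 with the denominator simplified. The function name `case2_mean_bounds` is a hypothetical label introduced for the illustration.

```python
import math


def case2_mean_bounds(alpha1, alpha2):
    """Bounds on the asymptotic mean for Case II.

    Case II uses Q1 p = alpha1 * p and Q2 p = alpha2 * p + (1 - alpha2).
    Returns (expected_operator_bound, moment_bound). For alpha1 > alpha2 the
    first value is the lower bound and the second the upper bound; the roles
    are reversed for alpha1 < alpha2. Assumes alpha1 != alpha2, both below 1.
    """
    expected_operator = (
        (1.0 - alpha2) - math.sqrt((1.0 - alpha1) * (1.0 - alpha2))
    ) / (alpha1 - alpha2)
    moment = (1.0 - alpha2) / (2.0 - alpha1 - alpha2)
    return expected_operator, moment


for a1, a2 in [(0.9, 0.6), (0.8, 0.5)]:
    low, high = case2_mean_bounds(a1, a2)
    print(f"alpha1 = {a1}, alpha2 = {a2}: {low:.3f} < V < {high:.3f}")
```

The first pair of parameters is the numerical example of the text; the second pair is the one quoted with Fig. 7.4.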


The form of the asymptotic distribution may be approximated by 
Monte Carlo computations as described in Section 6.3. We provide an 


example in Fig. 7.4. 


Fig. 7.4. The approximate asymptotic cumulative distribution for Case II,
discussed in Section 7.6, obtained from a 1000-trial stat-rat with $\alpha_1 = 0.8$ and
$\alpha_2 = 0.5$. The mean and standard deviation of the stat-rat p values are
$V_{1,\infty} = 0.634$ and $\sigma_{\infty} = 0.158$, respectively. The bounds given by relations
7.57 and 7.58 in the text give $0.613 < V_{1,\infty} < 0.714$. (Horizontal axis: p, from
0 to 1.0.)
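A stat-rat run of this kind is simple to imitate. The Python sketch below follows one simulated sequence of p values under the Case II operators and reports the mean and standard deviation of the p values visited. The function name `simulate_case2`, the starting value, the seed, and the run length (much longer than the 1000 trials of the figure) are assumptions chosen for this example, so the printed figures will not match the stat-rat values exactly.

```python
import random


def simulate_case2(alpha1, alpha2, trials, p0=0.5, seed=0):
    """Simulate one sequence of p values for Case II and return the list.

    On each trial response A_1 occurs with probability p, after which
    Q1 p = alpha1 * p is applied; otherwise response A_2 occurs and
    Q2 p = alpha2 * p + (1 - alpha2) is applied.
    """
    rng = random.Random(seed)
    p = p0
    history = []
    for _ in range(trials):
        history.append(p)
        if rng.random() < p:
            p = alpha1 * p
        else:
            p = alpha2 * p + (1.0 - alpha2)
    return history


values = simulate_case2(0.8, 0.5, trials=200_000)
mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
print(f"mean = {mean:.3f}, standard deviation = {std:.3f}")
```

Averaging along a single long sequence treats the p values as a sample from the asymptotic distribution; the first few trials still reflect the starting value, but their weight in a run of this length is negligible.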


7.7 CASES WITH ONLY ONE ABSORBING BARRIER

In previous sections of this chapter we considered two cases: (I) $\lambda_2 = 0$
and $\lambda_1 = 1$, and (II) $\lambda_1 = 0$ and $\lambda_2 = 1$. For Case I we said that there
were two absorbing barriers, at p = 0 and p = 1, because once a p value
reached one of those points it forever remained there. This case of two
absorbing barriers suggests the possibility of only one such barrier.
In applications to experimental problems we might like to use an event
operator $Q_1$ which permitted perfect learning ($\lambda_1 = 1$) but an operator $Q_2$
which did not lead to complete extinction ($\lambda_2 \ne 0$). Or we might want
to allow complete extinction but rule out perfect learning.




When $\lambda_1 = 1$ absorption at p = 1 may occur, but when $0 < \lambda_2$
absorption at p = 0 is impossible. We might expect that nearly all
sequences of p values would eventually reach p = 1. In the next section
we prove that this is correct. Similarly, when $\lambda_1 < 1$ and $\lambda_2 = 0$,
eventual absorption at p = 0 will occur. (From stat-rat computations,
as described in Section 6.2, we have found that the absorption can be
extremely slow even for values of $\alpha_1$ and $\alpha_2$ as small as 0.8.)

The case of one absorbing barrier could be used in an attempt to apply 
association theory to our mathematical system [2]. One of the basic 
principles of association theory (or contiguity theory) is that the response 
which occurs last in a stimulus situation is more likely to occur upon the 
next presentation of that situation. Suppose that we consider response 
$A_1$ to be "success" and response $A_2$ to be "failure." On each trial
success or failure terminates the stimulus situation, and so whichever
occurred should tend to have greater probability of occurrence on the
next trial, according to the principles of association theory. Hence $Q_1$
should increase p and $Q_2$ should increase q = 1 - p, that is, $Q_2$ should
decrease p. But suppose that we demanded that "success" eventually
occur on all trials. This could be approximated by letting $\lambda_1 = 1$ but
requiring that $\lambda_2 \ne 0$. The latter requirement would mean that an
occurrence of failure would usually increase q, the probability of failure, but
that q could never reach unity; q would be at most $1 - \lambda_2$. The "rate of
learning" would then depend upon the values of $\alpha_1$, $\alpha_2$, and $\lambda_2$.


*7.8 THE ASYMPTOTIC DISTRIBUTION FOR ONE ABSORBING BARRIER

When there is only one absorbing barrier, all density is ultimately
absorbed at that barrier. We consider the case of $\lambda_2 = 0$ and $\lambda_1 < 1$ in
detail because the proof for $0 < \lambda_2$ and $\lambda_1 = 1$ follows by a simple
symmetry argument. We have the operators

(7.60)  $Q_1 p = \alpha_1 p + (1 - \alpha_1)\lambda_1$   $(\lambda_1 < 1),$
        $Q_2 p = \alpha_2 p$   $(\alpha_2 < 1),$

and $Q_1$ is applied with probability p and $Q_2$ is applied with probability
1 - p. This case is included under the theorem stated in Section 4.7,
and so we know that the asymptotic distribution is independent of the
initial p values. But we now prove that the foregoing operators lead to
an asymptotic distribution that has all its density at p = 0.*

* A similar proof was developed by A. Birnbaum [3].

From the trapping theorem of Section 4.7 we know that absorption
cannot occur at p = 1, that is, all the density of the asymptotic distribution
is in the range from zero to $\lambda_1$. Thus, if absorption occurs it is certain
to occur at p = 0; if p = 0, then $Q_2 p = 0$ and $Q_2$ is certain to be applied.
To prove that absorption at p = 0 will occur with probability one as the
trial number n becomes infinite, we show that the conditional probability,
$\gamma_n$, of an infinite sequence of $A_2$ responses beginning on trial n, given an
$A_1$ response on trial n - 1, is greater than zero. When an $A_2$ response
occurs on trial n and on all future trials we say that the sequence of p
values is in a state of pre-absorption. The probability that the sequence
is in this state on trial n we denote by $\psi_n$, whereas the conditional
probability of the sequence being in a state of pre-absorption on trial n, given
that it was not in that state on trial n - 1, is $\gamma_n$. We then see that


(7.61)  $\psi_{n+1} = \psi_n + (1 - \psi_n)\gamma_{n+1}.$

This recurrence formula has the formal solution,

(7.62)  $\psi_n = 1 - (1 - \psi_0) \prod_{n'=1}^{n} (1 - \gamma_{n'}).$

Clearly, $\psi_n \to 1$ as $n \to \infty$, provided that

(7.63)  $\lim_{n \to \infty} \prod_{n'=1}^{n} (1 - \gamma_{n'}) = 0.$

This in turn will be true if $\gamma_{n'}$ is greater than some positive quantity $\epsilon$
for all n', because one condition that the product tends to zero is that the
sum of the $\gamma_{n'}$ diverges with n [4]; if $\gamma_{n'} > \epsilon$, then $\sum_{n'=1}^{n} \gamma_{n'} > n\epsilon$, and of
course $n\epsilon$ diverges. We see at once that

(7.64)  $\gamma_n = (1 - p_n)(1 - \alpha_2 p_n)(1 - \alpha_2^2 p_n) \cdots = \prod_{\nu=0}^{\infty} (1 - \alpha_2^{\nu} p_n).$

In Section 7.3 we encountered such a product for $P_{2,\infty}$. Analogous to
inequality 7.24 we have

(7.65)  $\gamma_n \ge (1 - p_n)^{1/(1 - \alpha_2)}.$


Provided that $p_n \ne 1$ for any n, we shall have $\gamma_n > \epsilon$ for all n. But since
the limit point $\lambda_1$ is less than unity and since $\lambda_2 = 0$ for the case being
considered in this proof, we can never have $p_n = 1$ except possibly for
n = 0. If, however, $p_0 = 1$, we need only consider the process starting
on trial n = 1. Therefore $\gamma_n$ is greater than some positive quantity $\epsilon$
for all n, and, as we previously saw, this implies that $\psi_n \to 1$ as $n \to \infty$.
This means that a p-value sequence will get in a state of pre-absorption
with probability one, and as a result absorption at p = 0 must eventually
occur.

The proof that absorption at p = 1 eventually occurs when we have the
operators

(7.66)  $Q_1 p = \alpha_1 p + (1 - \alpha_1)$   $(\alpha_1 < 1),$
        $Q_2 p = \alpha_2 p + (1 - \alpha_2)\lambda_2$   $(0 < \lambda_2),$

follows immediately by symmetry. We merely interchange p and q =
1 - p, and the above proof is applicable; eventual absorption occurs at
q = 0.

7.9 SUMMARY 


In this chapter we discuss cases of the general model for which the
limits $\lambda_1$ and $\lambda_2$ are zero or unity and the events are subject controlled.
These cases are of particular interest for applications in which "perfect"
learning or "complete" extinction or both are possible.

Case I, discussed in Sections 7.2, 7.3, 7.4, and 7.5, involves the operators

(7.3)   $Q_1 p = \alpha_1 p + (1 - \alpha_1),$
        $Q_2 p = \alpha_2 p.$

Since all sequences tend toward p = 0 or p = 1, we call these points
"absorbing barriers." The probability that a sequence gets arbitrarily
close to p = 1 depends upon $\alpha_1$, $\alpha_2$, and $p_0$, as is shown in Section 7.3.
Bounds on the asymptotic mean are provided.

Case II, discussed in Section 7.6, involves the operators

(7.4)   $Q_1 p = \alpha_1 p,$
        $Q_2 p = \alpha_2 p + (1 - \alpha_2).$

The points p = 0 and p = 1 are "reflecting barriers" because if a sequence
ever reached one of these points it would be certain to move away from it
on the next trial. Some bounds on the asymptotic mean for this case
are provided.

Cases with only one absorbing barrier are discussed in Sections 7.7
and 7.8. One of two such cases employs the operators

(7.60)  $Q_1 p = \alpha_1 p + (1 - \alpha_1)\lambda_1$   $(\lambda_1 < 1),$
        $Q_2 p = \alpha_2 p$   $(\alpha_2 < 1).$

For this case it is proved that sequences tend toward p = 0 with
probability one. The other such case arises when the operators are

(7.66)  $Q_1 p = \alpha_1 p + (1 - \alpha_1)$   $(\alpha_1 < 1),$
        $Q_2 p = \alpha_2 p + (1 - \alpha_2)\lambda_2$   $(0 < \lambda_2).$

In this case, sequences tend toward p = 1 with probability one.


REFERENCES 


1. Harris, T. E., Bellman, R., Shapiro, H. N. Studies in functional equations occurring 
in decision processes. Research Memorandum RM-878, RAND Corporation, Santa 
Monica, Calif., July 1, 1952. 

2. Guthrie, E. R. The psychology of learning. New York: Harper, 1935. 

3. Birnbaum, A., personal communication. 

4. Bromwich, T. J. I'A. An introduction to the theory of infinite series. London:
Macmillan, 1942, second edition, pp. 104-106.


CHAPTER 8 


Commuting Operators 


8.1 THE COMMUTATIVITY CONDITIONS


If two operators yield the same result when they are applied in either
order we say that those operators commute. In this chapter we consider
cases of the general mathematical system which arise when two operators
commute with each other. The mathematical system described in the
first four chapters has already been specialized to yield a number of cases
which lead to mathematical simplifications and which are of interest in
various kinds of applications to learning. Experimenter-controlled
events led to few problems but subject-controlled events led us into
serious mathematical difficulties. Because we believe that such events
are important in experimental problems, most of the last three chapters
were devoted to their analysis and discussion. This chapter also is
devoted mainly to subject-controlled events for which the probabilities
of the two responses on trial n are $p_n$ and $1 - p_n$.

The notion of commutativity of two operators was first introduced
in Section 1.4, but in Section 3.6 we gave a more explicit discussion of the
commutativity of the operators $Q_1$ and $Q_2$. Those operators commute
provided that

(8.1)  $Q_1 Q_2 p = Q_2 Q_1 p$

for all values of p. If this condition holds, it can be inferred that success
followed by failure gives the same net result as failure followed by success,
if $Q_1$ and $Q_2$ corresponded to success and failure in an experiment. Thus,
with commutativity there is no "recency" effect: the more recent event
does not have a greater effect. The commutativity condition, however,
has even more far-reaching implications. Suppose that we have a sequence
of n events, $E_1$ and $E_2$, and that some number k of them are $E_1$'s and n - k
of them are $E_2$'s. Corresponding to this event sequence will be a sequence
of operators which leads to a particular final p value. Now if the event
operators commute, any re-arrangement of the order of the events will
not change the final p value, provided the total number of $E_1$'s and the
number of $E_2$'s are kept the same. If the p value on trial zero is $p_0$, the p
value on trial n is

(8.2)  $p_{n,k} = Q_1^{k} Q_2^{n-k} p_0.$

This fact simplifies matters considerably.
We saw in Section 3.6 that $Q_1$ and $Q_2$ commute if and only if one or
more of the following conditions holds:

        (a) $\alpha_1 = 1,$
(8.3)   (b) $\alpha_2 = 1,$
        (c) $\lambda_1 = \lambda_2 = \lambda.$

Fig. 8.1. The three possible sequences of p values for three trials when the
operators commute. The paths ACEF, ACDF, and ABDF show the sequences
$E_1E_1E_2$, $E_1E_2E_1$, and $E_2E_1E_1$, respectively. The figure was drawn using equations
8.4 with $\lambda = 0.8$, $p_0 = 0.1$, $\alpha_1 = 0.7$, and $\alpha_2 = 0.9$. (Axes: p values against
trials, n.)


Condition a implies that $Q_1 p = p$, that is, that $Q_1$ is the identity operator
which does not change p. Similarly, condition b implies that $Q_2 p = p$.
Condition c says that $Q_1$ and $Q_2$ have the same limit point $\lambda$, that is,
$Q_1^n p$ and $Q_2^n p$ approach the same asymptote as n goes to infinity. These
three conditions, taken together, are equivalent to requiring that the
operators have the forms

(8.4)   $Q_1 p = \alpha_1 p + (1 - \alpha_1)\lambda,$
        $Q_2 p = \alpha_2 p + (1 - \alpha_2)\lambda.$




We see that $\alpha_1 = 1$ and $\alpha_2 = 1$ are special restrictions that yield operators
which already satisfy condition c above. It is easily demonstrated that
these operators commute; by direct computation we see that

(8.5)   $Q_1 Q_2 p = \alpha_1 [\alpha_2 p + (1 - \alpha_2)\lambda] + (1 - \alpha_1)\lambda$
        $= \alpha_1 \alpha_2 p + (1 - \alpha_1 \alpha_2)\lambda.$

The reader may easily verify that $Q_2 Q_1 p$ gives the same result. Moreover,
it is a simple matter to generalize this result to k applications of $Q_1$ and
n - k applications of $Q_2$ to obtain

(8.6)  $p_{n,k} = Q_1^{k} Q_2^{n-k} p_0 = \alpha_1^{k} \alpha_2^{n-k} p_0 + (1 - \alpha_1^{k} \alpha_2^{n-k})\lambda.$

In Fig. 8.1 we illustrate the three possible sequences for n = 3 and k = 2.
These sequences are $E_1E_1E_2$, $E_1E_2E_1$, and $E_2E_1E_1$. Each sequence leads
to the same value of p, but the paths are different. In the following
sections we discuss some of the properties of such sequences of p values.
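The claim that every ordering ends at the same p value can be confirmed by brute force. The Python sketch below applies the two operators with a common limit point in each of the three possible orders for n = 3 and k = 2, using the parameter values quoted for Fig. 8.1, and compares the results with the closed form in which the product of the operator parameters multiplies the initial probability. The helper names `apply_operator` and `final_p` are hypothetical.

```python
from itertools import permutations


def apply_operator(p, alpha, lam):
    """Linear operator Q p = alpha * p + (1 - alpha) * lam."""
    return alpha * p + (1.0 - alpha) * lam


def final_p(events, p0, alphas, lam):
    """Apply the operators named in `events` (a sequence of 1s and 2s) in order."""
    p = p0
    for event in events:
        p = apply_operator(p, alphas[event], lam)
    return p


lam, p0 = 0.8, 0.1
alphas = {1: 0.7, 2: 0.9}

# The distinct orderings of two E_1 events and one E_2 event.
orders = sorted(set(permutations((1, 1, 2))))
finals = [final_p(order, p0, alphas, lam) for order in orders]

# Closed form with k = 2 applications of Q1 and n - k = 1 of Q2.
shrink = alphas[1] ** 2 * alphas[2]
closed_form = shrink * p0 + (1.0 - shrink) * lam

for order, value in zip(orders, finals):
    print(order, round(value, 6))
print("closed form:", round(closed_form, 6))
```

Only the intermediate p values differ between orderings, which is what the separate paths in the figure depict.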


8.2 EQUAL LIMIT POINTS OF THE OPERATORS


When neither operator, $Q_1$ nor $Q_2$, is an identity operator they have
limit points $\lambda_1$ and $\lambda_2$, respectively. As we have just seen, $Q_1$ and $Q_2$
commute when $\lambda_1 = \lambda_2 = \lambda$. We discuss this condition in this section;
we require that $\alpha_1$ and $\alpha_2$ be strictly less than unity. Such a set of
restrictions seems plausible for a number of experimental problems. For
example, in the Brunswik T-maze experiment first described in the
Introduction, a rat may always be rewarded on the right side and never be
rewarded on the left side. Both types of events might increase p, the
probability of turning right, towards a limit point of unity, as assumed
in the Introduction; but, contrary to an assumption made there, the
magnitude of the effects of the two events may be different. Thus, for
this problem, we might have $\lambda_1 = \lambda_2 = \lambda$, but $\alpha_1 \ne \alpha_2$. Moreover, we
may not wish to assume that $\lambda = 1$.

Another example for which equal limit points would seem reasonable
is extinction of an operant response, such as bar pressing. When the
operant response occurs and is not reinforced, "inhibition" might result,
and so the probability p of that response should decrease. Furthermore,
when other responses occur, we might expect them to increase in
probability even if they are not rewarded by the experimenter. Therefore,
the occurrence of both the unreinforced operant response and of other
responses might tend to decrease the probability of the operant response
towards an asymptote of zero. In such a situation we would have
$\lambda_1 = \lambda_2 = 0$. We would not want to assume that the two types of
events had effects that were of equal magnitude and so we would not set
$\alpha_1$ equal to $\alpha_2$.




The asymptotic distribution of p values is especially simple when we
have equal limit points. All the density is at the point $\lambda = \lambda_1 = \lambda_2$.
Intuitively this is clear since each operator moves p towards $\lambda$. Moreover,
it may be seen that the expression in equation 8.6 tends towards $\lambda$ as n
tends to infinity because both $\alpha_1$ and $\alpha_2$ are less than unity, by assumption.
Our chief interest, then, is in the distributions of p values during the early
part of the learning process. First we consider the problem of computing
the distribution means from trial to trial and then we provide some
approximate formulas for computing the expected cumulative number of
occurrences of each response.

Fig. 8.2. The limiting p values $Q_1^n p_0$ and $Q_2^n p_0$ for the case of equal limit points. The
computations were made with $\lambda = 0.8$, $p_0 = 0.1$, $\alpha_1 = 0.7$, and $\alpha_2 = 0.9$. The middle
curve shows the approximate means $V_{1,n}$ computed from equation 8.8. (Axes:
probability against trials, n, from 0 to 20.)

The recurrence formula for the means is obtained from equations 4.52
and 4.53 with the relations $a_i = (1 - \alpha_i)\lambda$ for i = 1, 2. We have

(8.7)  $V_{1,n+1} = (1 - \alpha_2)\lambda + [\alpha_2 - (\alpha_1 - \alpha_2)\lambda] V_{1,n} + (\alpha_1 - \alpha_2) V_{2,n}.$

This formula has not been materially simplified by the equal-$\lambda$ assumption
because $\alpha_1 \ne \alpha_2$, so we use the approximate methods given in Chapter 6.
To begin, we note that the distribution of p values on trial n is contained
entirely between $Q_1^n p_0$ and $Q_2^n p_0$. In Fig. 8.2 we illustrate these extreme
sequence bounds computed with the aid of equation 3.5. The upper and
lower bounds developed in Section 6.7 can be used to obtain limits on the
means $V_{1,n}$, but we use the expected operator approximation described in
Section 6.4. When the difference between $\alpha_1$ and $\alpha_2$ is not large, the
limits $Q_1^n p_0$ and $Q_2^n p_0$ are fairly close together. The means computed
from the expected operator lie between those limits, and so we can expect
the error to be small. Using the conditions $a_i = (1 - \alpha_i)\lambda$, we obtain
from equations 6.11 and 6.12 the


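APPROXIMATE EXPLICIT FORMULA FOR THE MEANS:

(8.8)  $V_{1,n} \cong \frac{\lambda (p_0 - \mu) e^{\rho n} - \mu (p_0 - \lambda)}{(p_0 - \mu) e^{\rho n} - (p_0 - \lambda)},$

where

(8.9)   $\mu = \frac{1 - \alpha_2}{\alpha_1 - \alpha_2},$
        $\rho = (1 - \alpha_2) - (\alpha_1 - \alpha_2)\lambda,$

and where we have taken $V_{1,0} = p_0$. In Fig. 8.2 we give an illustration
of the use of this approximate formula.
If $\lambda = 0$ we have two "extinction" operators. With this condition
the preceding equations give

(8.10)  $V_{1,n} \cong \frac{\mu p_0}{p_0 + (\mu - p_0) e^{\rho n}},$

with

(8.11)  $\mu = \frac{1 - \alpha_2}{\alpha_1 - \alpha_2}, \quad \rho = 1 - \alpha_2.$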


The expected total number of $A_1$ occurrences as $n \to \infty$, for example,
the total number of responses in extinction, can be estimated by the area
under the curve of $V_{1,n}$ versus n. We know that $V_{1,n}$ must lie between
$Q_1^n p_0$ and $Q_2^n p_0$, and so limits on the area under the curve of $V_{1,n}$ are

(8.12)  $\sum_{n=0}^{\infty} Q_1^n p_0 = \frac{p_0}{1 - \alpha_1}$

and

(8.13)  $\sum_{n=0}^{\infty} Q_2^n p_0 = \frac{p_0}{1 - \alpha_2}.$




Furthermore, an approximation to the area under the curve is obtained
from equation 8.10:

(8.14)  $\int_0^{\infty} V_{1,n}\, dn \cong \int_0^{\infty} \frac{\mu p_0}{p_0 + (\mu - p_0) e^{\rho n}}\, dn.$

This is readily integrated to yield (logarithms are to the base e)
AN ESTIMATE OF THE EXPECTED TOTAL NUMBER OF $A_1$ RESPONSES ($\lambda = 0$):

(8.15)  $E(T_1) \cong \frac{-\log \left( 1 - \frac{\alpha_1 - \alpha_2}{1 - \alpha_2}\, p_0 \right)}{\alpha_1 - \alpha_2}.$

As a numerical example, consider the values $\alpha_1 = 0.95$, $\alpha_2 = 0.75$,
and $p_0 = 1.00$. We then obtain for the integral

(8.16)  $E(T_1) \cong 5 \log 5 = 8.05.$


The limits given by equations 8.12 and 8.13 are

(8.17)  $\frac{p_0}{1 - \alpha_1} = 20, \quad \frac{p_0}{1 - \alpha_2} = 4.$

We made one hundred stat-rat runs of 25 trials each, as described in
Section 6.2, and obtained a mean number of 8.99 $A_1$ responses with a
standard deviation of 3.1. Hence the standard deviation of the estimate
of the mean, 8.99, is about 0.31. Whereas the integral underestimates
the mean number of $A_1$ responses in this problem, the result may be close
enough for some purposes.
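Both the integral estimate and a stat-rat style check are short computations. In the Python sketch below, `expected_total_responses` evaluates the logarithmic estimate for two extinction operators (common limit point zero), and `simulate_total_responses` counts $A_1$ responses in simulated sequences. Both function names, the seed, and the number of simulated runs are assumptions of this example; the simulation is not the original stat-rat computation and will not reproduce its figures exactly.

```python
import math
import random


def expected_total_responses(alpha1, alpha2, p0):
    """Integral estimate -log(1 - (a1 - a2) * p0 / (1 - a2)) / (a1 - a2).

    Assumes a common limit point of zero and alpha1 != alpha2, both below 1.
    """
    ratio = (alpha1 - alpha2) / (1.0 - alpha2)
    return -math.log(1.0 - ratio * p0) / (alpha1 - alpha2)


def simulate_total_responses(alpha1, alpha2, p0, trials, runs, seed=0):
    """Mean number of A_1 responses over `runs` sequences of `trials` trials."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        p = p0
        for _ in range(trials):
            if rng.random() < p:   # response A_1 occurs; apply Q1 p = alpha1 * p
                total += 1
                p = alpha1 * p
            else:                  # response A_2 occurs; apply Q2 p = alpha2 * p
                p = alpha2 * p
    return total / runs


estimate = expected_total_responses(0.95, 0.75, 1.0)
simulated = simulate_total_responses(0.95, 0.75, 1.0, trials=25, runs=2000)
print(f"integral estimate = {estimate:.2f}")
print(f"simulated mean over 25 trials = {simulated:.2f}")
```

Because each run is cut off at 25 trials, the simulated mean counts only responses within that window, as in the runs described in the text.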

The preceding approximations were derived for the case of $\lambda = 0$.
When we have the complementary case of $\lambda = 1$, similar results may be
obtained. We need merely to replace $p_0$ with $q_0$, $V_{1,n}$ with $(1 - V_{1,n})$,
and interchange $\alpha_1$ and $\alpha_2$ to obtain as

AN ESTIMATE OF THE EXPECTED TOTAL NUMBER OF $A_2$ RESPONSES ($\lambda = 1$):

(8.18)  $E(T_2) \cong \frac{-\log \left( 1 - \frac{\alpha_2 - \alpha_1}{1 - \alpha_1}\, q_0 \right)}{\alpha_2 - \alpha_1}.$

We find this equation useful in Chapter 11. It will be interpreted as the
total number of errors prior to "perfect" learning.


When $p_0 = 1$, as well as $\lambda = 0$, equation 8.15 reduces to

(8.19)  $E(T_1) \cong \frac{-\log [(1 - \alpha_1)/(1 - \alpha_2)]}{\alpha_1 - \alpha_2}.$

Similarly, when $q_0 = 1$, equation 8.18 for $\lambda = 1$ reduces to

(8.20)  $E(T_2) \cong \frac{-\log [(1 - \alpha_2)/(1 - \alpha_1)]}{\alpha_2 - \alpha_1}.$

The right side of each of the preceding two equations is an example of a
function,

(8.21)  $T(\alpha, \beta) = \frac{-\log [(1 - \alpha)/(1 - \beta)]}{\alpha - \beta},$

which is symmetric in $\alpha$ and $\beta$, that is,

(8.22)  $T(\alpha, \beta) = T(\beta, \alpha).$

This function arises in Part II. We present its values for various values
of $\alpha$ and $\beta$ in Table B at the end of the book.


8.3 THE FIRST OCCURRENCE OF RESPONSE $A_1$

In the preceding section we considered the expected total number of
occurrences of one response or the other in the special cases when $\lambda = 0$
and $\lambda = 1$. In this section we consider another property of the sequences
of p values: the mean number of trials prior to the first occurrence of one
response or the other. Suppose that "failure" in an experiment increases
the probability of "success" and that we wish to know how many trials,
on the average, are required before the first success occurs. In the
Solomon-Wynne experiment on avoidance training (first discussed in the
Introduction) we might like to know about the mean trial of the first
avoidance.

The mathematical problem is a rather simple one in the theory of runs. 
In Section 4.8 we developed a framework for discussing such questions, 
and so we rely upon those results. We consider only the case of Ag = 1, 
giving the operator 
(8.23) Qs p = xs p + (1 — %2), 


and compute the mean number of trials before the first A, occurrence. 
(We make no assumptions about the value of 4; in this section.) If an 
A, response occurs on trial 0, then there will be Æg runs of length 
1,2, 3,---. The mean length of this run of A;'sis obtained from equation 


4.90: 
(8.24) E(Ry4) = AT 0o 


178 COMMUTING OPERATORS cH. 8 


where we have used the abbreviation of equation 4.92: 


(8.25) Tow = || 9x, Too = |. 


K=1 


The qg are the probabilities of response A, for trials K = 1,2,-- - prior 
to the first A, occurrence. For the operator of equation 8.23, 


(8.26) qx = 428 qo. 
Using this expression for qx in the previous equation gives 
Tow = [Dh = (%29o)(%2"qo) * + * (eg'qo) 


= ay DS 


(8.27) 


Hence the mean number of A,’s before the first A,, given an A, on trial 0, is 


2 


e 
(8.28) E(Rgo) = X as Pg, 
»=0 
We encountered this sum in Section 4.8 and considered it a special case of 
a function ®(a, 9) defined by 
o 
(8.29) O(a, B) = X geg, 
»=0 


This function is shown in Table A at the end of the book. In these terms 
we have 

(8.30) E(Ro,9) = Gus, qo). 

This equation gives the mean number of $A_2$'s under the condition that an
$A_2$ occurs on trial 0. The probability that an $A_2$ occurs on trial 0 is
$q_0$, and so the unconditional mean number of trials, $E(F_1)$, before the
first $A_1$ occurrence is

(8.31)  $E(F_1) = q_0 \Phi(\alpha_2, q_0)$.

For given values of $\alpha_2$ and $q_0$ we can compute the value of $E(F_1)$
from Table A.

In Chapter 11 we are also interested in the variance of the number of
trials before the first $A_1$ occurrence. Equation 4.91 leads to the result

(8.32)  $\sigma^2(F_1) = 2 q_0 \Psi(\alpha_2, q_0) + q_0 \Phi(\alpha_2, q_0) - [q_0 \Phi(\alpha_2, q_0)]^2$,

where the function $\Psi(\alpha_2, q_0)$ is an example of the function

(8.33)  $\Psi(\alpha, \beta) = \sum_{\nu=1}^{\infty} \nu \alpha^{\nu(\nu+1)/2} \beta^{\nu}$,

which is also given in Table A.




As a numerical example of the use of the preceding equations, let
$\alpha_2 = 0.94$ and $q_0 = 1.00$. From Table A we get
$\Phi(0.94, 1.00) = 5.0776$ and $\Psi(0.94, 1.00) = 13.7904$. These values
then give $E(F_1) = 5.0776$ and $\sigma^2(F_1) = 6.8764$.
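
As a rough check on this example, the two series can be summed directly rather than read from Table A. The sketch below assumes the series forms $\Phi(\alpha, \beta) = \sum_{\nu \ge 0} \alpha^{\nu(\nu+1)/2} \beta^{\nu}$ and $\Psi(\alpha, \beta) = \sum_{\nu \ge 1} \nu \alpha^{\nu(\nu+1)/2} \beta^{\nu}$; the function names, the truncation tolerance, and the cap on the number of terms are illustrative choices and not part of the text.

```python
def phi(alpha, beta, tol=1e-12, max_terms=10_000):
    """Partial sum of Phi(alpha, beta) = sum_{v>=0} alpha**(v(v+1)/2) * beta**v."""
    total = 0.0
    for v in range(max_terms):
        term = alpha ** (v * (v + 1) // 2) * beta ** v
        total += term
        if term < tol:          # remaining terms are negligible
            break
    return total


def psi(alpha, beta, tol=1e-12, max_terms=10_000):
    """Partial sum of Psi(alpha, beta) = sum_{v>=1} v * alpha**(v(v+1)/2) * beta**v."""
    total = 0.0
    for v in range(1, max_terms):
        term = v * alpha ** (v * (v + 1) // 2) * beta ** v
        total += term
        if term < tol:
            break
    return total


def first_occurrence_moments(alpha2, q0):
    """Mean and variance of F1 from equations 8.31 and 8.32."""
    mean = q0 * phi(alpha2, q0)
    second_moment = 2 * q0 * psi(alpha2, q0) + q0 * phi(alpha2, q0)
    return mean, second_moment - mean ** 2


mean, var = first_occurrence_moments(0.94, 1.00)
print(round(mean, 4), round(var, 4))
```

The early-exit test relies on the terms eventually decreasing, which holds when $0 < \alpha < 1$ or when $\alpha = 1$ and $\beta < 1$; otherwise the loop simply stops at the cap.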


8.4 THE SECOND OCCURRENCE OF RESPONSE $A_1$

In the preceding section we considered the mean number of trials
before the first occurrence of response $A_1$ when $\lambda_2 = 1$, that is,
when $Q_2$ has the form

(8.34)  $Q_2 p = \alpha_2 p + (1 - \alpha_2)$.

We also presented a formula for the variance of the number of trials
before the first $A_1$ occurrence. We now consider the distribution of the
number of trials before the second $A_1$ occurrence when $Q_2$ is as given
above and

(8.35)  $Q_1 p = \alpha_1 p + (1 - \alpha_1)$,

that is, when $\lambda_1 = \lambda_2 = 1$.

Let $S_1$ be a random variable denoting the number of trials before the
second $A_1$ occurrence. The probability distribution of this random
variable can be computed and its mean obtained; we omit the derivation
but present the results. The distribution is given by


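(8.36)  $\Pr(S_1 = \nu) = \alpha_2^{(\nu-1)(\nu-2)/2} q_0^{\nu-1} (1 - \alpha_1 \alpha_2^{\nu-1} q_0) \left[ \frac{1 - \alpha_1^{\nu}}{1 - \alpha_1} - q_0 \frac{\alpha_2^{\nu} - \alpha_1^{\nu}}{\alpha_2 - \alpha_1} \right]$.

When $\alpha_1 = \alpha_2 = \alpha$, this expression is indeterminate, but it
can be shown that

(8.37)  $\Pr(S_1 = \nu) = \alpha^{(\nu-1)(\nu-2)/2} q_0^{\nu-1} (1 - \alpha^{\nu} q_0) \left[ \frac{1 - \alpha^{\nu}}{1 - \alpha} - \nu \alpha^{\nu-1} q_0 \right]$.

The expected value of $S_1$ when $\alpha_1 \neq \alpha_2$ is given by

(8.38)  $E(S_1) = 1 + \frac{\alpha_1 q_0}{\alpha_2 - \alpha_1} + q_0 \left[ \frac{1}{1 - \alpha_1} - \frac{\alpha_1}{\alpha_2 - \alpha_1} \right] \Phi(\alpha_2, q_0) - \alpha_1^2 q_0 \left[ \frac{1}{1 - \alpha_1} - \frac{q_0}{\alpha_2 - \alpha_1} \right] \Phi(\alpha_2, \alpha_1 q_0)$,

where $\Phi$ is the function given in Table A. When $\alpha_1 = \alpha_2 = \alpha$
this expression is replaced by

(8.39)  $E(S_1) = \frac{1 + (q_0 - \alpha) \Phi(\alpha, q_0)}{1 - \alpha} - q_0 \Psi(\alpha, q_0)$,

where $\Psi$ is the other function given in Table A.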
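
The indeterminate case can be resolved by a short limit argument. Assuming that the only part of the distribution that becomes 0/0 when the two parameters coincide is the ratio $(\alpha_2^{\nu} - \alpha_1^{\nu})/(\alpha_2 - \alpha_1)$, the worked step below sketches that limit; it is offered as an illustration and not as the derivation omitted above.

```latex
\frac{\alpha_2^{\nu} - \alpha_1^{\nu}}{\alpha_2 - \alpha_1}
  = \sum_{j=0}^{\nu-1} \alpha_1^{\nu-1-j}\,\alpha_2^{j}
  \;\longrightarrow\; \nu\,\alpha^{\nu-1}
  \qquad (\alpha_1,\ \alpha_2 \to \alpha).
```

Each of the $\nu$ terms in the finite sum tends to $\alpha^{\nu-1}$, which is why a factor $\nu$ appears in the equal-parameter case.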




8.5 CASES WITH ONE IDENTITY OPERATOR
In Section 8.1 we pointed out that the two operators $Q_1$ and $Q_2$ commute
either if they have the same limit point $\lambda$ or if one of those
operators is the identity operator. We now consider the latter case with
$Q_2$ as the identity operator. We then have

(8.40)  $Q_1 p = \alpha_1 p + (1 - \alpha_1)\lambda$,
        $Q_2 p = p$.

The operator $Q_2$ does not change $p$ and so it does not have a unique limit
point. The event $E_2$, associated with $Q_2$, has no influence on the response
probabilities. It may be difficult to envisage such an event in a strict
sense, but, if $E_2$ is believed to have a relatively small effect on subsequent
behavior, it may be convenient to assume $Q_2 p = p$ as an approximation.
In Part II we have occasion to make this assumption.

Obviously, nearly all sequences tend towards the limit point $\lambda$, and
so the limiting distribution has all its density at the point $\lambda$. This
may be shown by an appeal to the law of large numbers [1]. The only exception
occurs when $p_0 = 0$; in this case operator $Q_1$ is never applied, and so the
probability remains at zero forever.

The recurrence formula for the mean $V_{1,n}$ is

(8.41)  $V_{1,n+1} = [1 + \lambda(1 - \alpha_1)] V_{1,n} - (1 - \alpha_1) V_{2,n}$.

We know that the distribution is contained between $p_0$ and $\lambda$ on all
trials, and so analogous to inequality 6.44 we have

(8.42)  $V_{2,n} \le (\lambda + p_0) V_{1,n} - \lambda p_0$.

Therefore, after using this inequality in equation 8.41, we get

(8.43)  $V_{1,n+1} \ge [1 + \lambda(1 - \alpha_1)] V_{1,n} - (1 - \alpha_1)[(\lambda + p_0) V_{1,n} - \lambda p_0]$,

or

(8.44)  $V_{1,n+1} \ge (1 - \alpha_1)\lambda p_0 + [1 - (1 - \alpha_1) p_0] V_{1,n}$.

We may obtain a lower bound on the mean $V_{1,n}$ by solving this expression
with the equality sign. The result is

(8.45)  $V_{1,n} \ge \lambda - (\lambda - p_0)[1 - (1 - \alpha_1) p_0]^n$.

An upper bound, $Y_n$, may be found from the expected operator procedure.
Equation 6.69 with $a_2 = 0$, $\alpha_2 = 1$, $a_1 = (1 - \alpha_1)\lambda$ gives

(8.46)  $Y_{n+1} = [1 + \lambda(1 - \alpha_1)] Y_n - (1 - \alpha_1) Y_n^2$.

Restriction 6.70 becomes

(8.47)  $1 + \lambda(1 - \alpha_1) - 2(1 - \alpha_1) Y_n \ge 0$.


The quadratic difference equation is troublesome, but the corresponding
differential equation

(8.48)  $\frac{dY_n}{dn} = (1 - \alpha_1) Y_n (\lambda - Y_n)$

has the solution

(8.49)  $Y_n = \frac{\lambda p_0}{p_0 + (\lambda - p_0) e^{-\lambda(1 - \alpha_1) n}}$.

Thus we have approximately

(8.50)  $V_{1,n} \cong \frac{\lambda p_0}{p_0 + (\lambda - p_0) e^{-\lambda(1 - \alpha_1) n}}$.

A further special case of one identity operator obtains when the limit
of the other operator is unity. This special case has been studied in
detail by G. A. Miller and W. J. McGill [2]. From a different formulation,
they obtain the very useful recurrence relation

(8.51)  $V_{1,n+1} = p_0 + (1 - p_0)(1 - \alpha_1^{n+1}) V_{1,n}$  $(\alpha_2 = 1,\ \lambda = 1)$.


This equation is exact and may be used for computing the mean on every
trial. It does not involve higher moments than the mean, as did our
previous recurrence formula, 8.41. In Fig. 8.3 we show an illustration
of how this equation can be used to compute an average learning curve.
For comparison, we also show the result of using the differential equation
approximation (8.49) with $\lambda = 1$.

Fig. 8.3. Distribution means, $V_{1,n}$, versus trials $n$ for the case of one
identity operator ($Q_2$). The lower curve shows the exact means computed from
equation 8.51, with $p_0 = 0.230$ and $\alpha_1 = 0.860$. The upper curve shows
the approximation given by equation 8.49 for $\lambda = 1$.
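
The two curves of Fig. 8.3 can be recomputed with a few lines of code. The sketch below is an illustration under stated assumptions: it takes the recurrence in the form $V_{1,n+1} = p_0 + (1 - p_0)(1 - \alpha_1^{n+1}) V_{1,n}$, checks it against a direct enumeration of the distribution (possible here because, with $Q_2$ the identity and $\lambda = 1$, a subject's state depends only on the number of $A_1$ occurrences so far), and evaluates the logistic approximation and the lower bound $\lambda - (\lambda - p_0)[1 - (1 - \alpha_1) p_0]^n$. All function names are hypothetical.

```python
import math


def exact_means_by_enumeration(p0, alpha1, n_trials):
    """Exact V_{1,n}, n = 0..n_trials, for Q1 p = alpha1*p + (1 - alpha1), Q2 p = p.

    After k occurrences of A1 the probability of A2 is alpha1**k * q0, so the
    distribution over k fully describes the population of p values.
    """
    q0 = 1.0 - p0
    dist = [1.0]                       # dist[k] = Pr(k occurrences of A1 so far)
    means = []
    for _ in range(n_trials + 1):
        means.append(sum(w * (1.0 - alpha1 ** k * q0) for k, w in enumerate(dist)))
        nxt = [0.0] * (len(dist) + 1)
        for k, w in enumerate(dist):
            p = 1.0 - alpha1 ** k * q0
            nxt[k + 1] += w * p        # A1 occurs: Q1 is applied
            nxt[k] += w * (1.0 - p)    # A2 occurs: identity operator
        dist = nxt
    return means


def means_by_recurrence(p0, alpha1, n_trials):
    """Means from the assumed form of recurrence 8.51."""
    v = [p0]
    for n in range(n_trials):
        v.append(p0 + (1.0 - p0) * (1.0 - alpha1 ** (n + 1)) * v[-1])
    return v


def logistic_approximation(p0, alpha1, n, lam=1.0):
    """Solution of dY/dn = (1 - alpha1) * Y * (lam - Y) with Y(0) = p0."""
    return lam * p0 / (p0 + (lam - p0) * math.exp(-lam * (1.0 - alpha1) * n))


def lower_bound(p0, alpha1, n, lam=1.0):
    """Lower bound on the mean obtained by solving 8.44 with equality."""
    return lam - (lam - p0) * (1.0 - (1.0 - alpha1) * p0) ** n


p0, alpha1 = 0.230, 0.860
exact = exact_means_by_enumeration(p0, alpha1, 24)
recur = means_by_recurrence(p0, alpha1, 24)
for n in (0, 4, 8, 12, 16, 20, 24):
    print(n, round(exact[n], 4), round(recur[n], 4),
          round(logistic_approximation(p0, alpha1, n), 4))
```

The enumeration grows by one state per trial, so it stays cheap for the trial counts shown in the figure; for operators that do not commute the number of states would instead double on every trial.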




From equations 8.41 and 8.51 we can obtain for the variance on the
$n$th trial

(8.52)  $\sigma_n^2 = V_{1,n}(1 - V_{1,n}) + \frac{[1 - (1 - \alpha_1^{n+1})(1 - p_0)] V_{1,n} - p_0}{1 - \alpha_1}$.

Hence the variance on each trial may be found in terms of the mean on
that trial. For $0 < \alpha_1 < 1$, $\sigma_n^2$ approaches zero as $n$ becomes
large.

The distribution of the number of trials, $F_1$, before the first $A_1$
occurrence is especially simple when $Q_2 p = p$. It is

(8.53)  $\Pr(F_1 = \nu) = q_0^{\nu}(1 - q_0)$.

The expected value is

(8.54)  $E(F_1) = \sum_{\nu=0}^{\infty} \nu \Pr\{F_1 = \nu\} = q_0 / p_0$,

and the variance is

(8.55)  $\sigma^2(F_1) = E(F_1^2) - [E(F_1)]^2 = q_0 / p_0^2$.

For example, if $p_0 = 0.5$, then $E(F_1) = 1$ and $\sigma^2(F_1) = 2$. From one
hundred stat-rat computations (see Section 6.2) we observed a mean value
of 1.17 trials before the first $A_1$ occurrence. The variance of this estimate
from one hundred stat-rats should be $\sigma^2(F_1)/100 = 0.02$, and so the
standard deviation is $\sqrt{0.02} = 0.14$. Thus, the stat-rat mean of 1.17
is a little more than one $\sigma$ above the true mean of 1.00.

The distribution of the number of trials $S_1$ before the second $A_1$
occurrence is also simplified when $Q_2 p = p$ and $\lambda = 1$. Equation 8.36
becomes, when $\alpha_2 = 1$,

(8.56)  $\Pr\{S_1 = \nu\} = q_0^{\nu-1}(1 - q_0)(1 - \alpha_1 q_0) \frac{1 - \alpha_1^{\nu}}{1 - \alpha_1}$.

The mean of this distribution may be obtained from equation 8.38 by
noting that

(8.57)  $\Phi(1, \beta) = \frac{1}{1 - \beta}$.

Some algebra leads to the simplified result

(8.58)  $E(S_1) = \frac{1 - \alpha_1 q_0^2}{(1 - q_0)(1 - \alpha_1 q_0)}$.

Furthermore, the variance can be computed without too much difficulty.
The result is

(8.59)  $\sigma^2(S_1) = \frac{q_0}{(1 - q_0)^2} + \frac{\alpha_1 q_0}{(1 - \alpha_1 q_0)^2}$.




When $q_0 = 0.5$ and $\alpha_1 = 0.9$ we have $E(S_1) = 2.818$ and
$\sigma^2(S_1) = 3.488$. From the one hundred stat-rats mentioned above we
obtain a mean $\bar{S}_1 = 3.27$; this is about $2.4\sigma$ above the true mean
2.818.
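
The quoted mean and variance can be checked from first principles, without relying on any closed formula. The sketch below assumes both operators have limit point 1, so that each $A_1$ multiplies the probability $q$ of response $A_2$ by $\alpha_1$ and each $A_2$ multiplies it by $\alpha_2$; it then tracks the probability mass of subjects with no $A_1$ so far and with exactly one $A_1$ so far. The function names and the stopping tolerance are illustrative.

```python
def second_occurrence_distribution(q0, alpha1, alpha2=1.0, tol=1e-14,
                                   max_trials=100_000):
    """Return [Pr(S1 = 0), Pr(S1 = 1), ...], S1 = trials before the second A1."""
    probs = []
    none_yet, one_so_far = 1.0, 0.0    # mass with zero / exactly one A1 so far
    for n in range(max_trials):
        # Because the operators commute, q depends only on the counts of A1, A2.
        q_none = alpha2 ** n * q0                              # n A2's, no A1
        q_one = alpha1 * alpha2 ** (n - 1) * q0 if n > 0 else 0.0  # one A1, n-1 A2's
        probs.append(one_so_far * (1.0 - q_one))               # second A1 on trial n
        none_yet, one_so_far = (none_yet * q_none,
                                one_so_far * q_one + none_yet * (1.0 - q_none))
        if none_yet + one_so_far < tol:                        # remaining mass negligible
            break
    return probs


def mean_and_variance(probs):
    mean = sum(v * p for v, p in enumerate(probs))
    second = sum(v * v * p for v, p in enumerate(probs))
    return mean, second - mean ** 2


probs = second_occurrence_distribution(q0=0.5, alpha1=0.9)
mean, var = mean_and_variance(probs)
print(round(mean, 3), round(var, 3))
```

Passing a value of `alpha2` below 1 gives the general case of Section 8.4 with the same code.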


8.6 EXPERIMENTER-SUBJECT-CONTROLLED EVENTS
WITH IDENTITY OPERATORS
In this section we consider a problem of interest in the applications
described in Chapter 13. Assume that there are two responses, $A_1$ and $A_2$,
and two outcomes, $O_1$ and $O_2$. For the events formed from $O_2$, we assume
identity operators, and for the other two events we choose the limit points
appropriate for perfect learning. The operators are then defined by

(8.60)  $Q_{11} p = \alpha p + (1 - \alpha)$,
        $Q_{12} p = p$,
        $Q_{21} p = \alpha p$,
        $Q_{22} p = p$,

where the subscripts on the operators refer to the response and outcome,
respectively. This is a specialization of the problem discussed in Section
5.9. We have taken the $\alpha$'s which correspond to $O_1$ equal to one another,
and so only one parameter, $\alpha$, remains. From equation 5.79 we see that
the recurrence formula is

(8.61)  $V_{1,n+1} = [1 + (\pi_1 - \pi_2)(1 - \alpha)] V_{1,n} - (\pi_1 - \pi_2)(1 - \alpha) V_{2,n}$.

The annoying second moment term remains, and so we consider the
expected operator approximation; we replace $V_{2,n}$ by $V_{1,n}^2$ and get

(8.62)  $V_{1,n+1} = V_{1,n} + (\pi_1 - \pi_2)(1 - \alpha)(V_{1,n} - V_{1,n}^2)$.

This difference equation is awkward to solve but the corresponding
differential equation,

(8.63)  $\frac{dV_{1,n}}{dn} = (\pi_1 - \pi_2)(1 - \alpha) V_{1,n} (1 - V_{1,n})$,

has the solution

(8.64)  $V_{1,n} = \frac{V_{1,0}}{V_{1,0} + (1 - V_{1,0}) e^{-(\pi_1 - \pi_2)(1 - \alpha) n}}$.

From this result, we see that

(8.65)  $V_{1,\infty} = 1$ when $\pi_1 > \pi_2$,
        $V_{1,\infty} = 0$ when $\pi_1 < \pi_2$,
        $V_{1,\infty} = V_{1,0}$ when $\pi_1 = \pi_2$.

Only the equal $\pi$ result is exact.
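
The three limiting cases can be seen numerically. The sketch below assumes the expected-operator difference equation in the form $V_{n+1} = V_n + (\pi_1 - \pi_2)(1 - \alpha) V_n (1 - V_n)$ and the logistic solution of the corresponding differential equation; the function names and the parameter values are illustrative only.

```python
import math


def expected_operator_step(v, pi1, pi2, alpha):
    """One step of V_{n+1} = V_n + (pi1 - pi2)(1 - alpha) V_n (1 - V_n)."""
    return v + (pi1 - pi2) * (1.0 - alpha) * v * (1.0 - v)


def logistic_solution(n, v0, pi1, pi2, alpha):
    """Solution of dV/dn = (pi1 - pi2)(1 - alpha) V (1 - V) with V(0) = v0."""
    rate = (pi1 - pi2) * (1.0 - alpha)
    return v0 / (v0 + (1.0 - v0) * math.exp(-rate * n))


v0, alpha = 0.3, 0.9
for pi1, pi2 in ((0.8, 0.2), (0.2, 0.8), (0.5, 0.5)):
    v = v0
    for _ in range(500):               # iterate the difference equation
        v = expected_operator_step(v, pi1, pi2, alpha)
    print(pi1, pi2, round(v, 4), round(logistic_solution(500, v0, pi1, pi2, alpha), 4))
```

For very large `n` with $\pi_1 < \pi_2$ the exponential can overflow in floating point, so the sketch keeps `n` moderate.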




8.7 SUMMARY 


The two operators $Q_1$ and $Q_2$ commute (yield the same result when
applied to $p$ in either order) if they have the same limit point $\lambda$ or
if one of the operators leaves $p$ unchanged. When the events are represented
by operators which commute, the order of the events in a sequence with a
fixed number of each event does not affect the final $p$ value. This fact
simplifies computations somewhat and, as we see in Part II, considerably
simplifies the task of estimating parameters from experimental data.

In Section 8.2 we present an approximate formula for the means $V_{1,n}$
as a function of $n$ when the operators have the same limit point $\lambda$. We
then obtain some estimates of the mean total number of $A_1$ responses
when $\lambda = 0$ and the mean number of $A_2$ responses when $\lambda = 1$. In
Sections 8.3 and 8.4 we develop expressions for the mean number of
trials before the first and second $A_1$ responses when $\lambda = 1$. We use
some of these properties in Chapter 11, where we analyze an experiment on
avoidance training.

In Section 8.5 we discuss the special case of $\alpha_2 = 1$, that is, of
$Q_2 p = p$. The previous results for equal limit points are applied to this
case, and appreciable simplifications result. This analysis will be applied
in Chapter 10 to an experiment on verbal learning. In Section 8.6 we extend
the identity-operator condition to experimenter-subject-controlled events.


REFERENCES
1. Feller, W. An introduction to probability theory and its applications.
New York: Wiley, 1950, p. 141.
2. Miller, G. A., and McGill, W. J. A statistical description of verbal
learning. Psychometrika, 1952, 17, 369-396.


PART II 


APPLICATIONS 


CHAPTER 9 


Identification and Estimation 


9.1 THE IDENTIFICATION PROBLEM

In the preceding eight chapters we have presented a mathematical
system intended to be useful for some, but by no means all, learning
problems. In addition, there are some computational schemes and
discussions of special cases. Although we have had applications to
psychological problems in mind throughout, the organization up to this point
has been guided by the mathematical analysis. In a sense, all we have
given the reader so far is a mathematical structure. Although the basic
features of that mathematics were described in more or less psychological
terms (responses, stimulus elements, trials, events, etc.), there exists no
essential connection between the system presented so far and the
experimental world.

The first step in applying the system to experimental data is to relate
the basic elements to actual behavior and events. We must unambiguously
identify system symbols with observables. For the most part we identify
elements and quantities of the system with things which are defined
experimentally. A response class in the system is identified with a class
of behavior observed and recorded by an experimenter such as "turning
left" in a maze or "pressing the bar" in a Skinner box. We do not
identify system responses with more microscopic behavior such as a
muscle flexion (Guthrie's "movements") unless those bits of behavior
are being observed and recorded. We do not mean to imply that we
consider studies of such behavior unimportant, or that a complete theory
of learning can ignore such matters. We have a much more restricted
goal; we wish to describe experimental data, and so we usually must
accept the experimenters' definitions of responses. In this sense, the
system is a descriptive theory rather than a new psychological theory of
learning. As is pointed out in the next section, this mathematical system,
in its most general form, can be made compatible, we believe, with several
current learning theories.




A trial has been defined as an opportunity for choosing among a set of
mutually exclusive and exhaustive alternatives or responses. In many
problems the identifications of system responses with certain experimental
behavior classes automatically identify system trials with experimental
trials. How microscopic the system is depends on the choice of the unit
which is to be called a trial. For example, each time a rat is placed at
the starting end of a T-maze and is allowed to pass a choice point we have
one experimental trial, and this will be identified with a trial in the system.
But in some experiments the correspondence is not so obvious. In a
runway, for example, an experimental trial is defined so that the rat
always leaves the starting box. If we were to identify a system response
with "leaving the starting box on an experimental trial," no choice is
involved; there is but one response, and its probability of occurrence
during an experimental trial is always unity (unless starting boxes are
used as coffins). In runway experiments, we are concerned with changes
in latency, that is, changes in the time which a rat spends in the starting
box before going into the runway. A reasonable identification is between
system response and "leaving the starting box during a small time interval."
During each second (to pick a specific unit of time) the rat may either
leave the box (response $A_1$) or not leave the box (response $A_2$). In this
case a system trial is identified with a time increment of one second. In
Chapter 14 we have more to say about these time problems. At this
point we wish only to emphasize that the appropriate definition of a trial
depends upon the identifications we make between system responses and
experimental responses.

Learning is represented by orderly changes resulting from the
occurrences of events. Again, the notion of an event is an abstract concept in
the system, but it is necessary to identify these events with empirical
events. As was suggested when the concept was introduced in Chapter 1,
the events are identified with such things as stimulus changes, giving a
reward or punishment, or actual response occurrences, depending upon
the problem. In general, whenever an experimenter manipulates a
subject's environment in a specified way, an event has occurred.
Furthermore, the subject may change its own environment by making a particular
response, and again this constitutes an event. In a somewhat degenerate
sense, an event is sometimes associated with no environmental change;
this will often be mathematically convenient, as is seen in later chapters.
We have no simple formula for identifying system events with experimental
events, but in each application we try to make the identifications as
explicit as possible. Intuition is an important guide. If we do have a
general principle, it is simply that any class of empirical events suspected
of systematically changing the subject's behavior in an experiment should
be identified with an event in the model. The model, of course, helps us
to cut down quickly the total number of variables involved, and helps
categorize the possibilities. (The reader may feel critical at this point
because intuition is not ordinarily regarded as an important scientific
principle. The word "intuition" has been used because we are trying
to make the difficulty overt, rather than conceal it beneath such phrases
as "natural and obvious choices.")

The utility of the general model depends upon both the formal structure
of the mathematical system and the appropriateness of the identifications
made. It could turn out that the general model might be quite adequate
with one set of identifications but wholly inadequate with another set. On
the other hand, it might be that no set of identifications would lead to
reasonable agreement between the model and data, in which case we
would be forced to discard the basic mathematical framework as
unsatisfactory for the problem. For example, the basic assumption of
linear operators may be untenable for many learning phenomena. But
before altering such a basic assumption, and thereby introducing major
mathematical difficulties, we would search for a set of identifications
which would lead to better agreement between the models and
experimental data.

It is also possible that more than one set of identifications would lead
to good agreement. For example, it is conceivable that in a two-choice
situation, say choosing "right" or "left," some people would behave as if
the response were right or left, whereas for others "same" or "opposite"
of previous trial would be the appropriate identifications of responses.


9.2 REINFORCEMENT THEORY VERSUS CONTIGUITY 
THEORY 


Though psychologically trained readers are well acquainted with the 
material in this section, we include it for the benefit of others. 


Two of the major psychological theories of learning differ in a
fundamental respect. On the one hand, reinforcement theory stems from
Thorndike's law of effect [1], which assumes that living organisms seek to
achieve or experience certain kinds of environmental changes and try to
avoid certain other "noxious" stimuli. All stimuli are assumed to
possess intrinsic properties which make them rewarding or punishing to a
given organism; certain basic needs or drives are postulated from extensive
observations of animal and human behavior, and reduction of those
drives is assumed to lead to learning. Contiguity theory, on the other
hand, postulates that the basic determinant of learning is association.
Stimuli and responses become "connected" or associated merely by their
being contiguous in time. The connections are viewed as being
continuously changing, but when an organism's environment is changed
the connections between the stimuli perceived just prior to the change and
the responses just made are preserved. When the organism next
encounters a stimulus situation, the response last made in that situation
will most likely occur. In contiguity theory, reinforcement is considered
to be only a stimulus change which preserves the stimulus-response
connections that existed just prior to the change. The distinction between
reward and punishment, in the operational sense of the terms, is made by
distinguishing between the kinds of responses which become associated
to the stimuli present; punishment is viewed as preserving connections
between the stimuli and withdrawal responses, that is, responses which
cause the cessation of pain.

From a psychological point of view, the difference between
reinforcement theory and contiguity theory is important. Hull's behavior system
[2] along with Spence's refinements and extensions [3] are based upon the
reinforcement concept, whereas Guthrie is the outstanding proponent of
association theory [4, 5]. These learning theories have led to various
lines of investigation, both theoretical and experimental. In some cases,
differential predictions have been made and crucial experiments have
resulted. Indeed this has been and continues to be a healthy state of
affairs in the development of psychological thinking.

In presenting the mathematical system there is no need to take a definite
position on the issue of reinforcement versus contiguity. In developing
the mathematical framework little or no reference is made to either set of
concepts. An event occurrence may be either a drive reduction, as Hull
might have said, or a stimulus change, as Guthrie might prefer. Both
schools of thought would agree that an event has occurred and that this
event has a definite effect on future behavior in the stimulus situation in
which it occurred. The major exception to this position of impartiality
is contained in Chapter 2, where a set-theoretic model of conditioning,
based primarily on association theory, is presented. This set-theoretic
model, originally developed by Estes, is not an essential part of our
mathematical system, however. As pointed out in Chapter 2, it was
given only as an illustration of how the basic operators could be derived
from more primitive assumptions in stimulus-response theory.

In principle, then, our mathematical model is not committed to either
reinforcement theory or contiguity theory. In this sense it is more general
than either of those theories. In a more important sense, however, it is
much less general. The assumption of linear operators, the identifications
we will make, and the assumed relations between the "true" probabilities
and experimental measures of behavior all make the model much more
restricted than the current learning theories. And it will be still more
restricted when we make special assumptions about the parameters
$\lambda_i$ and $\alpha_i$ in applying the model to experimental problems. In
making these special assumptions we shall have an opportunity to be
influenced by the several learning theories. In some cases the special
assumptions will be dictated by considerations such as symmetry (in
experimental situations), the fact that people do learn short lists of words
perfectly, the fact that pigeons do extinguish on a pecking response, etc.
In other cases, however, we may have a choice. For example, if we were
trying to describe extinction of an operant response $A_1$, we could assume
that the operator $Q_1$, which is applied when $A_1$ occurs, decreases the
probability $p$ of $A_1$ to zero ($\lambda_1 = 0$, $\alpha_1 \neq 1$ so
$Q_1 p = \alpha_1 p$) and that the operator $Q_2$, which is applied when other
responses $A_2$ occur, does not change $p$ ($\lambda_2 = 0$, $\alpha_2 = 1$ so
$Q_2 p = p$). These assumptions would correspond most closely to the
Hullian inhibition theory of extinction. On the other hand, we could argue
that $Q_1$ does not appreciably change $p$ ($\lambda_1 = 0$, $\alpha_1 = 1$ so
$Q_1 p = p$) but that $Q_2$ decreases $p$ to zero ($\lambda_2 = 0$,
$\alpha_2 \neq 1$ so $Q_2 p = \alpha_2 p$). These assumptions would be
consistent with the Guthrian concept that extinction occurs because other
responses become associated with the stimuli. The mathematical model
permits either extreme view of the extinction process or any combination.
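
The two extreme views of extinction differ only in which operator is taken as the identity, and this is easy to see in a few lines. The sketch below is an illustration, not part of the model's formal development: the helper names, the starting probability 0.8, the value 0.9 for the non-identity parameter, and the response sequence are all arbitrary choices.

```python
def linear_operator(alpha, lam):
    """Return Q with Q(p) = alpha * p + (1 - alpha) * lam."""
    return lambda p: alpha * p + (1.0 - alpha) * lam


def run_sequence(p0, responses, q1, q2):
    """Apply q1 after each response 1 (A1) and q2 after each response 2 (A2)."""
    p = p0
    for r in responses:
        p = q1(p) if r == 1 else q2(p)
    return p


identity = linear_operator(1.0, 0.0)    # alpha = 1: Q p = p
decrement = linear_operator(0.9, 0.0)   # lambda = 0, alpha = 0.9: Q p = 0.9 p

responses = [1, 1, 1, 2]                # three A1 occurrences, then one A2

# "Hullian" view: only A1 occurrences reduce p.
hullian = run_sequence(0.8, responses, q1=decrement, q2=identity)
# "Guthrian" view: only occurrences of other responses reduce p.
guthrian = run_sequence(0.8, responses, q1=identity, q2=decrement)
print(hullian, guthrian)
```

Any intermediate view corresponds to giving both operators a parameter between the two extremes, with no change to the code.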


9.3 EXPERIMENTAL VARIABLES
Experimenters studying animal and human learning have been concerned
with a number of variables, including amount of reward, strength
of drive, amount of work required in making the response, time interval
between response and reward, and the intensity and duration of an electric
shock. Our model does not explicitly involve variables of this kind.
To be sure, a complete theory of learning would attempt to handle these
variables in an explicit and unambiguous way. But our program is a
more provisional one. The model uses linear operators which introduce
parameters such as $\lambda_i$ and $\alpha_i$. These parameters must depend upon
the experimental variables listed above and probably upon still others.
However, in this work the parameters $\lambda_i$ and $\alpha_i$ which remain after
special assumptions are made in each problem must be estimated from the
data. Hence, the relation of $\alpha_i$, for example, to the amount of reward
given is an empirical question. In an analogous way, Ohm's law relates
current, voltage, and resistance. But it is still an engineering problem to
determine the resistance for any particular object.

In a sense the model provides the experimental psychologist with
statistics that summarize some kinds of data. We present a mathematical
model and procedures for estimating parameters from data. Parametric
studies can then determine how the parameters depend upon various
experimental variables. One difficulty which has been evident in the
interpretation of learning data from various experiments is the lack of
obvious and useful summary statistics. How do we measure "speed of
learning" from a sequence of right and left turns in a T-maze, for example?
Or how does an experimenter summarize in one or two statistics how
rapidly an operant response extinguishes? The measures which have been
used, such as time or trials to a fixed criterion of conditioning or extinction,
have serious shortcomings in view of the evident (to us) statistical and
sequentially dependent nature of the processes. Extinction of bar
pressing, for instance, is often defined as "complete" by an experimenter
when no responses occur during a predetermined time interval. Yet he
may find that one animal which has met the criterion will respond very
rapidly a short time later whereas another animal may not. Time to a
criterion may be quite sensitive to the time interval used in defining the
criterion, and a comparison of data obtained under different conditions
is difficult. While agreeing that such measures are useful, we feel that
more stable and more psychologically meaningful statistics are needed
and that we are not likely to discover them without a model. During
the remainder of this book our chief concern is with estimating parameters
from experimental data and with comparing these data with implications
of the model.


9.4 THE ESTIMATION PROBLEM 

We are now faced with the rather technical statistical question of how
to estimate parameters of the model from actual data. The parameters
are never strictly determined from [...] intrinsically a statistical one.
[...] and worse ways of estimating parameters. A group of subjects are
considered to represent a [...] The probability of a response on a p[...]
determined from the data [...] probability changes cannot be determined
ex[...] of statistical inference. It is desired to obtain [...] parameters
from data. A simple analogy is the [...]ts we estimate the probability of a
[...] of identical coins, we can flip all of [...] obtain an estimate of the
probability. The procedure used in polling is another example. A population
parameter, for example, proportion of people unemployed, is estimated from
a sample of people.


The estimation problem in the learning models is usually more
complicated than those involved in the foregoing examples. In fact, it might as
well be admitted that the estimation process becomes quite forbidding in
some situations. We have a double sampling process as we now indicate.
For two subject-controlled events there are $2^n$ possible $p$ values on trial
$n$, since there are $2^n$ possible sequences of $A_1$'s and $A_2$'s. In general,
these $2^n$ values of probability can all be different. In a group of $k$
identical subjects the $i$th subject has a particular probability value $p_{in}$
on trial $n$ ($i = 1, 2, \cdots, k$). (Identical subjects have the same values of
$\lambda_j$, $\alpha_j$, and initial probability $p_0$.) These $k$ values $p_{in}$
constitute a sample from the population of $2^n$ possible $p$ values. Chapters 3
and 4 dealt with the properties of such populations, but we must now consider
in detail various properties of samples drawn from these populations. Suppose
that the sample of $k$ probability values is a random sample. It will have
a mean value denoted by $\bar{p}_n$. The population mean $V_{1,n}$ can be
estimated by the sample mean $\bar{p}_n$. Similarly, higher moments of the
sample, such as its variance, can be used to estimate the corresponding
moments of the population of probability values. The estimation procedure
would be as straightforward as this if we knew the $k$ probabilities $p_{in}$.
But, sad to state, they are not known. They are true probabilities and must in
turn be estimated from the data just as we would estimate a single probability
$p$ of a head when a coin is flipped. The double estimation process is now
evident: the $k$ probability values $p_{in}$ are estimated and then used to
obtain estimates of properties of the $p$-value population. An analogy may help
to make the point clearer. Consider a large population of coins, most
of which are not "true," that is, they have a distribution of probabilities
of coming up heads when flipped. (These probabilities correspond to $p$
values.) It is desired to estimate the properties of this distribution; so
first select a sample of $k$ coins. Flip each coin a number of times to get
an estimate of the probability for each coin in the sample. Then use these
individual coin estimates to infer properties of the population of coins
from which the sample was drawn.

The probability $p_{in}$ for the $i$th subject on trial $n$ may be regarded as
the parameter of a binomial distribution of a random variable $x_{in}$. This
random variable $x_{in}$ has the value 1 if response $A_1$ occurs on trial $n$
for the $i$th subject, and it has the value 0 if an $A_2$ occurs. The probability
that $x_{in} = 1$ is $p_{in}$, and the probability that $x_{in} = 0$ is
$1 - p_{in} = q_{in}$. Given $N$ trials with the same probability $p_{in}$, the
probability of $x$ occurrences of $A_1$ ($x = 0, 1, \cdots, N$) is given by the
familiar binomial distribution

(9.1)  $f(x) = \binom{N}{x} p_{in}^{x} q_{in}^{N-x}$,  $\binom{N}{x} = \frac{N!}{x!\,(N-x)!}$.


In applications of the general model, presented in the following chapters,
we seldom have more than one trial for estimating a given probability $p_{in}$
because the probabilities change for each individual on each trial, and
usually by a different amount. When we have but one trial, $N = 1$, and
the above equation for the binomial reduces to the statement that
$f(0) = q_{in}$ and $f(1) = p_{in}$. Then it may be proved that the unbiased
estimate of $p_{in}$ is 1 when $x_{in} = 1$ and is 0 when $x_{in} = 0$. With as
little information as this to estimate the $p_{in}$ the reader may feel that the
whole estimation problem is a bit hopeless. But this is not so, at least in the
special cases we present. When two operators $Q_1$ and $Q_2$ commute
(see Chapter 8) the probability $p_{in}$ on trial $n$ is independent of the
particular order of $A_1$'s and $A_2$'s up to trial $n$, and depends only upon the
number $k$ of $A_1$ occurrences and the number $n - k$ of $A_2$ occurrences. In
other words, all sequences with $k$ occurrences of $A_1$ and $n - k$ of $A_2$
terminate in the same probability value, which we denote by $p_{kn}$. In
experimental records of groups of subjects we may find several such equivalent
sequences, and so have several subject-trials which can be used to estimate
$p_{kn}$. When this happens the estimation problem is simplified greatly. Of
course, a price is paid for this. We are forced to assume that all subjects
have approximately the same parameter values, or else we are estimating
some sort of average parameter. Such simplifications are discussed in
connection with several applications.
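
The binomial distribution of equation 9.1, its one-trial special case, and the pooling of equivalent subject-trials can be illustrated briefly. The sketch below is a minimal example with hypothetical function names and made-up outcome data; it is not an estimation procedure taken from the text.

```python
from math import comb


def binomial_pmf(x, n_trials, p):
    """f(x) = C(N, x) * p**x * q**(N - x), with q = 1 - p (equation 9.1)."""
    return comb(n_trials, x) * p ** x * (1.0 - p) ** (n_trials - x)


def pooled_estimate(outcomes):
    """Proportion of A1 responses (coded 1) among equivalent subject-trials.

    Each outcome comes from a different subject whose record up to trial n
    contains the same number k of A1 occurrences, so that under commuting
    operators all of them share the same probability.
    """
    return sum(outcomes) / len(outcomes)


p = 0.3
print(binomial_pmf(0, 1, p), binomial_pmf(1, 1, p))   # one trial: q and p
print(pooled_estimate([1, 0, 0, 1, 0]))               # five equivalent subject-trials
```

With a single observation the estimate can only be 0 or 1; pooling several equivalent subject-trials is what makes intermediate estimates possible.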


9.5 MONTE CARLO CHECKS ON ESTIMATES 

Procedures for estimating the parameters are described in the following 
sections of this chapter and in later chapters. Often it is helpful to have 
à check on the estimation procedure. Suppose that data from an experi- 
ment did behave according to the model, that is, the true probabilities 
of the responses changed according to the postulated rules. We would 
still want to know how good a particular estimation procedure was in 
extracting from the data the true values of the parameters. Such a check 
is readily available. One need only fabricate a set of data using the 
Monte Carlo method or *'stat-rat"^ procedure described in Section 6.2. 

In making a set of Monte Carlo computations first set up the desired
form of the operators $Q_1$ and $Q_2$. For example: $Q_1 p = \alpha_1 p + (1 - \alpha_1)$
and $Q_2 p = \alpha_2 p + (1 - \alpha_2)$. Next select a value of the initial probability,
$p_0$, of response $A_1$, and choose numerical values for the remaining
parameters, for instance, $\alpha_1$ and $\alpha_2$ for the example just given. Then make a
number of Monte Carlo runs as described in Section 6.2. From these
runs we obtain sequences of responses $A_1$ and $A_2$; these sequences are
the data desired.


Having obtained a number of sequences of responses, proceed as with
a set of experimental data by applying the particular estimation procedure
under study to obtain numerical estimates of the parameters. Finally,
these estimates can be compared with the known values used in making the
Monte Carlo runs. Needless to say, the greater the number of sequences
computed, the better are the estimates of the parameters if the estimation
procedure is a good one. Some examples are given in the next section.
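
The check just described can be sketched in a few lines of Python. This is a minimal illustration and not part of the original computations. It assumes that operator $Q_1$ is applied after each $A_1$ response and $Q_2$ after each $A_2$ response, that both operators have the form $Q_i p = \alpha_i p + (1 - \alpha_i)$, and that $\alpha_2 = 1$, so that the number of trials before the first $A_1$ is geometric with mean $q_0/p_0$. The function names and the parameter values are hypothetical.

```python
import random

def stat_rat_run(p0, alpha1, alpha2, n_trials, rng):
    """Simulate one sequence of responses: 1 stands for A_1, 2 for A_2."""
    p = p0
    responses = []
    for _ in range(n_trials):
        if rng.random() < p:
            responses.append(1)
            p = alpha1 * p + (1.0 - alpha1)   # assumed form of Q_1
        else:
            responses.append(2)
            p = alpha2 * p + (1.0 - alpha2)   # assumed form of Q_2
    return responses

def mean_trials_before_first_a1(runs):
    """Mean number of trials preceding the first A_1 in each run."""
    counts = [seq.index(1) if 1 in seq else len(seq) for seq in runs]
    return sum(counts) / len(counts)

# Known ("true") values used to fabricate the data.
true_p0, true_alpha1, true_alpha2 = 0.5, 0.9, 1.0

rng = random.Random(0)
runs = [stat_rat_run(true_p0, true_alpha1, true_alpha2, 200, rng)
        for _ in range(2000)]

# Estimation step: with alpha_2 = 1 the mean is q_0 / (1 - q_0),
# so q_0 is estimated by f1 / (1 + f1).
f1 = mean_trials_before_first_a1(runs)
q0_hat = f1 / (1.0 + f1)
print(f1, q0_hat, 1.0 - true_p0)
```

Comparing `q0_hat` with the known value `1 - true_p0` is the Monte Carlo check; increasing the number of runs should bring the two closer if the estimation procedure is a good one.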


9.6 SIMPLE STATISTICS OF THE DATA 
In most experimental applications considered in the following chapters,
the data for a single subject can be characterized by a sequence of $A_1$'s
and $A_2$'s. For example, a rat will generate a sequence of right and left
turns in a T-maze, or a dog will produce a sequence of avoidances and
non-avoidances in an avoidance training experiment. From such
sequential data on a number of subjects, numerous simple statistics can
be computed. Sometimes these statistics can be related to the parameters
in the model to provide a direct method of estimation.
One simple statistic is the mean number of trials before the first $A_1$
occurrence. This is readily computed from the data, and we call it $F_1$
($F$ for “first” and the subscript 1 for $A_1$). We have been able to obtain
a simple closed expression for the expected value of such a quantity only
when $\lambda_2 = 1$, that is, when $Q_2$ is defined by

(9.2)    $Q_2 p = \alpha_2 p + (1 - \alpha_2)$.

Equation 8.31 gives the theoretical expression for the expected number of
trials before the first $A_1$ occurrence in this case. Hence, we have the
estimation equation*

(9.3)    $F_1 \,\hat{=}\, \Phi(\alpha_2, q_0) \qquad (\lambda_2 = 1)$,

where $\Phi$ is a function given in Table A, and where $q_0$ is the initial
probability of response $A_2$. We can readily compute a similar statistic,
$F_2$, defined as the mean number of trials before the first $A_2$ occurrence.
When $\lambda_1 = 0$, that is, when

(9.4)    $Q_1 p = \alpha_1 p$,

we have the estimation equation

(9.5)    $F_2 \,\hat{=}\, \Phi(\alpha_1, p_0) \qquad (\lambda_1 = 0)$.

* Here and elsewhere the symbol $\hat{=}$ means "is estimated by" or "estimates." On
one side of this symbol there is a function of the data, and on the other side a
mathematical function of the parameters being estimated. From classical statistics, if $m$ is
the true mean of a distribution and $\bar{x}$ the mean of a random sample drawn from this
distribution, the notation could read either $\bar{x} \,\hat{=}\, m$ (the sample mean estimates the
population mean) or $m \,\hat{=}\, \bar{x}$ (the population mean is estimated by the sample mean).




When the model for a particular experiment involves $\lambda_2 = 1$, we can use
the statistic $F_1$ to get a relationship between $p_0$ and $\alpha_2$, and when it involves
$\lambda_1 = 0$ we can use $F_2$ to get a relationship between $p_0$ and $\alpha_1$. These
relationships, along with others to be discussed next, can lead to estimates
of the parameters. For the special case when $\alpha_2 = 1$, that is, when

(9.6)    $Q_2 p = p$,

equation 9.3 simplifies to

(9.7)    $F_1 \,\hat{=}\, q_0/p_0 \qquad (\alpha_2 = 1)$.

Similarly, when $\alpha_1 = 1$, that is, when

(9.8)    $Q_1 p = p$,

equation 9.5 simplifies to

(9.9)    $F_2 \,\hat{=}\, p_0/q_0 \qquad (\alpha_1 = 1)$.
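
The expectation that underlies $F_1$ can also be evaluated directly from the model, as a rough numerical stand-in for a table lookup. The sketch below rests on an assumption of ours, not on a formula quoted from Chapter 8: when $\lambda_2 = 1$, each $A_2$ occurrence multiplies $q$ by $\alpha_2$, so the probability that the first $n$ trials are all $A_2$ is $q_0 (\alpha_2 q_0)(\alpha_2^2 q_0) \cdots (\alpha_2^{n-1} q_0)$, and the expected number of trials before the first $A_1$ is the sum of these probabilities over $n = 1, 2, \ldots$. The function name is hypothetical, and whether this sum coincides exactly with the tabled $\Phi$ is not verified here.

```python
def expected_trials_before_first_a1(alpha2, q0, tol=1e-12, max_terms=100000):
    """Sum over n >= 1 of P(first n trials are all A_2), assuming lambda_2 = 1."""
    total = 0.0
    term = 1.0      # probability that the first n trials are all A_2 (n = 0)
    q = q0          # probability of A_2 on the next trial
    for _ in range(max_terms):
        term *= q               # extend the run of A_2's by one trial
        if term < tol:
            break
        total += term
        q *= alpha2             # each A_2 occurrence multiplies q by alpha_2
    return total

# With alpha_2 = 1 the sum is geometric and reduces to q_0 / p_0.
check = expected_trials_before_first_a1(1.0, 0.54)
print(check, 0.54 / 0.46)
```

When $\alpha_2 = 1$ the sum is a geometric series and the result agrees with the ratio $q_0/p_0$ of equation 9.7, which is a useful sanity check on the sketch.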


Another simple statistic of the data, denoted by $S_1$, is the mean number
of trials before the second $A_1$ occurrence ($S$ for "second"). We computed
the expectation of this statistic when $\lambda_1 = \lambda_2 = 1$; from equation 8.38
we can get an estimation equation for this case. Similarly, when
$\lambda_1 = \lambda_2 = 0$, we can get an estimation equation for $S_2$, the mean number
of trials before the second $A_2$ response. For the special case when
$\lambda_1 = 1$ and $\alpha_2 = 1$, that is, when

(9.10)    $Q_1 p = \alpha_1 p + (1 - \alpha_1)$,
          $Q_2 p = p$,

we obtain, from equation 8.58,

(9.11)    $S_1 \,\hat{=}\, \dfrac{1 - \alpha_1 q_0^2}{(1 - q_0)(1 - \alpha_1 q_0)} \qquad (\lambda_1 = 1,\ \alpha_2 = 1)$.

Similarly, when $\lambda_2 = 0$ and $\alpha_1 = 1$, that is, when

(9.12)    $Q_1 p = p$,
          $Q_2 p = \alpha_2 p$,

we have the estimation equation

(9.13)    $S_2 \,\hat{=}\, \dfrac{1 - \alpha_2 p_0^2}{(1 - p_0)(1 - \alpha_2 p_0)} \qquad (\lambda_2 = 0,\ \alpha_1 = 1)$.

For the special cases mentioned, the statistic $S_1$ or $S_2$ yields another
relation among $p_0$, $\alpha_1$, and $\alpha_2$. If one of these parameters is known or
assumed, this relation along with the one obtained from $F_1$ or $F_2$ may be
solved for the unknown parameters. We now give two examples.




For one hundred stat-rat runs, made with $\lambda_1 = 1$ and $\alpha_2 = 1$, mentioned
in Section 8.5, we obtained 1.17 for the mean number of trials
before the first $A_1$ and 3.27 for the mean number of trials before the second
$A_1$. From equation 9.7 above we get

(9.14)    $F_1 = 1.17 \,\hat{=}\, \dfrac{q_0}{1 - q_0}$.

We solve this equation and get as our estimate* of $q_0$ the value

(9.15)    $\hat{q}_0 = 0.54$.

From equation 9.11 above we get

(9.16)    $S_1 = 3.27 \,\hat{=}\, \dfrac{1 - \alpha_1 q_0^2}{(1 - q_0)(1 - \alpha_1 q_0)}$.

When we use the estimate $\hat{q}_0 = 0.54$ in this equation, we obtain as our
estimate of $\alpha_1$

(9.17)    $\hat{\alpha}_1 = 0.97$.

The values actually used in making the stat-rat runs were $q_0$ = ... and
$\alpha_1$ = ... . Although the estimates are not as close to the true
values as we would like, we were able to obtain these estimates very
easily. Later we shall see how better estimates can be made from these
stat-rat data.
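
The arithmetic of this example is easy to reproduce. The sketch below solves $F_1 = q_0/(1 - q_0)$ for $q_0$ and then solves the relation $S_1 = (1 - \alpha_1 q_0^2)/[(1 - q_0)(1 - \alpha_1 q_0)]$ for $\alpha_1$; the rearranged closed form and the function names are ours, introduced only for illustration.

```python
def q0_from_f1(f1):
    """Solve F_1 = q_0 / (1 - q_0) for q_0."""
    return f1 / (1.0 + f1)

def alpha1_from_s1(s1, q0):
    """Solve S_1 = (1 - a*q0**2) / ((1 - q0) * (1 - a*q0)) for a."""
    c = s1 * (1.0 - q0)
    return (c - 1.0) / (q0 * (c - q0))

# Statistics reported for the one hundred stat-rat runs.
q0_hat = round(q0_from_f1(1.17), 2)
alpha1_hat = round(alpha1_from_s1(3.27, q0_hat), 2)
print(q0_hat, alpha1_hat)   # 0.54 0.97
```

Rounding $\hat{q}_0$ to two decimals before substituting, as the text does, or carrying the unrounded value both give $\hat{\alpha}_1 = 0.97$ to two decimals.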
As a second example we consider thirty stat-rat computations made
with $\lambda_1 = \lambda_2 = 1$ and $p_0 = 0$. The parameter values $\alpha_1 = 0.70$ and
$\alpha_2 = 0.95$ were used, but $\alpha_1$ and $\alpha_2$ will be estimated from the data.
From the thirty sequences we found that $F_1 = 5.67$. Equation 9.3 with
$q_0 = 1$ gives

(9.18)    $F_1 = 5.67 \,\hat{=}\, \Phi(\alpha_2, 1)$.

From Table A, we find by interpolation that

(9.19)    $\hat{\alpha}_2 = 0.950$.

This happens to be the precise value used in making the computations.

* Here, as elsewhere, the circumflex (^) over a parameter denotes an estimate of that
parameter.

Another statistic of learning data which is often used when “perfect”
learning occurs is the total number of errors. For example, in the
T-maze with reward on one side only, a rat will eventually stop making
errors. Similarly, in the Solomon-Wynne experiment on avoidance
training, a normal dog seems to have learned to avoid without fail after
it has been shocked on a certain number of trials. Denote the total
number of $A_2$ occurrences by $T_2$ ($T$ for "total"). In Section 8.2 it was
shown that, for $\lambda_1 = \lambda_2 = 1$, the expected value of $T_2$ is approximated by
equation 8.18, and so we have the estimation equation

(9.20)    $T_2 \,\hat{=}\, \dfrac{-\log\left[1 - \dfrac{q_0(\alpha_2 - \alpha_1)}{1 - \alpha_1}\right]}{\alpha_2 - \alpha_1} \qquad (\lambda_1 = \lambda_2 = 1)$.

When $\alpha_2 = 1$ this simplifies to

(9.21)    $T_2 \,\hat{=}\, \dfrac{-\log p_0}{1 - \alpha_1} \qquad (\lambda_1 = 1,\ \alpha_2 = 1)$.

When all sequences approach zero asymptotically a useful statistic is $T_1$,
the mean total number of $A_1$ responses. In experimental extinction, for
example, $T_1$ is some finite number. In Section 8.2 we showed that for
$\lambda_1 = \lambda_2 = 0$, the expected value of $T_1$ is approximated by equation 8.15.
Hence we have the estimation equation

(9.22)    $T_1 \,\hat{=}\, \dfrac{-\log\left[1 - \dfrac{p_0(\alpha_1 - \alpha_2)}{1 - \alpha_2}\right]}{\alpha_1 - \alpha_2} \qquad (\lambda_1 = \lambda_2 = 0)$,

and for $\alpha_1 = 1$ this becomes

(9.23)    $T_1 \,\hat{=}\, \dfrac{-\log q_0}{1 - \alpha_2} \qquad (\lambda_2 = 0,\ \alpha_1 = 1)$.

We now give an example of the use of these estimation equations involving
$T_1$ and $T_2$.

The thirty stat-rats, mentioned above, for which $p_0 = 0$ and $\lambda_1 = \lambda_2 = 1$,
gave the value $T_2 = 8.4$. We may now use equation 9.20. We take
$q_0 = 1$ and get

(9.24)    $T_2 = 8.4 \,\hat{=}\, \dfrac{-\log\dfrac{1 - \alpha_2}{1 - \alpha_1}}{\alpha_2 - \alpha_1} = T(\alpha_2, \alpha_1)$,

where $T(\alpha_2, \alpha_1)$ is the function given in Table B. We then use the value
$\hat{\alpha}_2 = 0.95$ obtained from the statistic $F_1$ above and, turning to Table B,
we find

(9.25)    $\hat{\alpha}_1 = 0.77$.

This is to be compared with the true value $\alpha_1 = 0.70$ used in making the
stat-rat runs.
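
In place of a table lookup, the relation used in this example can be solved numerically. The sketch below assumes the form $T(\alpha_2, \alpha_1) = -\log[(1 - \alpha_2)/(1 - \alpha_1)]/(\alpha_2 - \alpha_1)$ with natural logarithms and $\alpha_1 < \alpha_2$, and finds $\alpha_1$ by bisection; the function names and the choice of bisection are ours, not the tabular method of the text.

```python
import math

def t_function(alpha2, alpha1):
    """Assumed form of T(alpha_2, alpha_1) for q_0 = 1, natural logarithm."""
    return -math.log((1.0 - alpha2) / (1.0 - alpha1)) / (alpha2 - alpha1)

def solve_alpha1(t2, alpha2, iterations=100):
    """Bisection for alpha_1 in (0, alpha_2); T increases with alpha_1 there."""
    lo, hi = 1e-6, alpha2 - 1e-6
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if t_function(alpha2, mid) < t2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha1_hat = solve_alpha1(8.4, 0.95)
print(round(alpha1_hat, 2))   # 0.77
```

The result agrees, to two decimals, with the value read from Table B in the text.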


Various other statistics can be computed from sequential learning data.
For example, the number of alternations, or the mean number of trials
before the third $A_1$ occurrence, might be of interest. Such statistics
depend directly on the parameters in the model, but we have not
investigated them. Instead we turn to some more efficient procedures for
estimating the parameters.


9.7 BIAS AND VARIANCE OF ESTIMATORS 


Various procedures for estimating parameters are known to statisticians. 
Some of these are better than others for particular purposes. For 
example, we might flip a coin 100 times to estimate the probability p of a 
head. It can be shown that the “best” estimate of p—best according to 
explicit criteria—is the proportion of the 100 trials on which a head 
appears. But there are other estimates of p available. We could use, 
say, the proportion of heads during the first thirty trials or the proportion 
of heads on every other trial. Or we might count the number of times
we get two heads in a row, use this to estimate $p^2$, and take the square root
of the result as an estimate of p. Many other estimators of p could be
devised, but none seem better than the obvious one, the proportion of 
heads in the total 100 trials; it is an unbiased estimate. 

It is desirable but not mandatory that an estimator be unbiased. To say 
that an estimate is unbiased is to say, in essence, that if we repeatedly 
draw samples and compute the estimate for each sample, the mean of 
these estimates will get nearer and nearer the true value of the parameter 
as the number of samples drawn gets large. More strictly, we may 
conceive of a distribution of all possible values of the estimate, and, if 
the mean of this distribution equals the true parameter value, the estimate 
is unbiased. If the estimate is biased, the mean will not equal the true
parameter value. Although unbiasedness is desirable, statisticians will
often sacrifice this property in favor of other important properties.

It is important to make clear what biased and unbiased estimates are, 
because the emotional tone of the word “bias” is so strong. We
frequently work with biased estimates. One reason for this is that
sometimes there are no unbiased estimates of the parameter we are
interested in estimating. Sometimes there may be an unbiased estimate,
and we are unaware of it. On the other hand, some of our biased esti- 
mates may have very little bias (difference between mean estimate and true 
parameter value) when the sample sizes are large. 


Another desirable property of a good estimator is that it have a small
variance. Again, we conceive of a distribution of all possible values of
the estimate, obtained from a very large number of samples of the same
size drawn from a population. This distribution of estimates, clustered
around the true value, will have a variance, and we want it to be as small
as possible. We will not usually know whether our statistic has smallest
variance, but we will usually have the property that as the sample sizes get
large the variance of the estimator will tend to zero. If the bias tends to
zero at the same time we shall have what is called a consistent estimator.
Unbiasedness and minimum variance are two properties related to the
more fundamental but vaguer notion that a good estimate has a
distribution tightly clustered about the true value. For a searching discussion of
criteria for estimates, see Savage [13].

Not for all the estimators we describe will it be possible to compute 
numerically the variance of the estimate. Whenever we can we shall do 
so. Though the variance of an estimate can be approximated by making 
Monte Carlo computations as described in Section 9.5, this technique 
ordinarily involves a great deal of labor; we need to make many sets of 
Monte Carlo runs to approximate the variance of the estimate. 
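
As an illustration of such a Monte Carlo approximation, the sketch below compares three of the coin-flipping estimators mentioned at the start of this section: the proportion of heads in all 100 flips, the proportion in the first thirty flips, and the square root of the proportion of adjacent pairs of flips that are both heads. The sample counts, the seed, and the function names are arbitrary choices of ours, and counting "two heads in a row" over overlapping adjacent pairs is one possible reading of that estimator.

```python
import random
import statistics

def estimators(flips):
    """Three estimates of p from a list of 0/1 flips (1 stands for a head)."""
    n = len(flips)
    all_trials = sum(flips) / n
    first_thirty = sum(flips[:30]) / 30
    pairs = sum(flips[i] * flips[i + 1] for i in range(n - 1)) / (n - 1)
    return all_trials, first_thirty, pairs ** 0.5

def bias_and_variance(p, n_flips=100, n_samples=5000, seed=1):
    """Approximate (bias, variance) of each estimator by repeated sampling."""
    rng = random.Random(seed)
    results = [
        estimators([1 if rng.random() < p else 0 for _ in range(n_flips)])
        for _ in range(n_samples)
    ]
    return [
        (statistics.fmean(column) - p, statistics.pvariance(column))
        for column in zip(*results)
    ]

summary = bias_and_variance(0.5)
for name, (bias, variance) in zip(
        ["all 100 flips", "first 30 flips", "root of pairs"], summary):
    print(name, round(bias, 4), round(variance, 5))
```

The labor referred to in the text shows up here as the 5000 repeated samples needed to obtain stable values for a single choice of p.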


9.8 MAXIMUM LIKELIHOOD ESTIMATORS 


The principle of maximum likelihood, developed by R. A. Fisher in the 
early 1920's, yields one of the most important estimation procedures 
known to mathematical statistics. We shall use this method in some of 
the following chapters, and so we give a brief and elementary exposition 
here. The reader is referred to several standard texts for more complete 
discussions of maximum likelihood [6, 7, 8]. 

The basic idea is simple enough. We take, as the estimate of a
parameter, that value which gives the greatest possible likelihood of obtaining
the data actually observed. Moreover, the computational procedure is
in principle straightforward. We write down the likelihood function $P$,
which is the probability of obtaining the observed data, in terms of the
parameter to be estimated. We then find the value of the parameter that
makes $P$ as large as possible. Consider a simple example. We have a
coin with an unknown probability $p$ of coming up heads. Suppose we
flip the coin 10 times and obtain 7 heads and 3 tails. Common sense
would tell us that a good estimate of $p$ is 0.7 in this case, but let us see
what the maximum likelihood estimate is. The probability $P$ of getting
precisely 7 heads and 3 tails in a particular order is

(9.26)    $P = p^7 (1 - p)^3$.

We now want to choose $p$ so that $P$ is a maximum. The standard
procedure is to differentiate $P$ with respect to $p$, set this derivative equal
to zero, and solve for $p$:

(9.27)    $\dfrac{dP}{dp} = 7p^6(1 - p)^3 - 3p^7(1 - p)^2 = p^6(1 - p)^2[7(1 - p) - 3p] = 0$.




The solution which makes P a maximum (rather than zero, the minimum)
is p = 0.7, as we intuited! This is of course the value of p that makes the
quantity in brackets vanish. Hence the procedure yields the obvious
estimate. However, maximum likelihood estimates often do not turn
out to be the intuitively obvious estimate, nor are they always, or even often,
unbiased in small samples. They do have one property that endears them
to us though; if there is an estimate which has in large samples the smallest
variance, then the maximum likelihood procedure will usually find it.

As another example of the maximum likelihood procedure we choose
a simple case of our learning model. Suppose that we have the operators

(9.28)    $Q_1 p = \alpha p + (1 - \alpha)$,
          $Q_2 p = \alpha p$.

Assume that we know that the initial probability $p_0$ has the value 0.2, and
we wish to estimate the single parameter $\alpha$. First suppose we observe
the sequence $A_1 A_2$. The likelihood function is then

(9.29)    $P = p_0(1 - Q_1 p_0) = p_0 \alpha (1 - p_0)$.

It is obvious that $P$ is a maximum when $\alpha = 1$ (the largest allowed value
of $\alpha$), and so this is the maximum likelihood estimate of $\alpha$. Next, suppose
we observed the sequence $A_1 A_1$. The likelihood function is then

(9.30)    $P = p_0 Q_1 p_0 = p_0[1 - \alpha(1 - p_0)]$,

and it is clearly largest when $\alpha = 0$. Therefore, this is the maximum
likelihood estimate of $\alpha$. Finally, consider the sequence $A_1 A_2 A_1$, which
gives the likelihood function

(9.31)    $P = p_0(1 - Q_1 p_0)\, Q_2 Q_1 p_0 = p_0 \alpha^2 (1 - p_0)[1 - \alpha(1 - p_0)]$.

We now take the derivative:

(9.32)    $\dfrac{dP}{d\alpha} = p_0(1 - p_0)[2\alpha - 3\alpha^2(1 - p_0)] = 0$.

The appropriate solution is

(9.33)    $\hat{\alpha} = \dfrac{2}{3(1 - p_0)}$.

Since we assumed $p_0$ was known to be 0.2 we find that $\hat{\alpha} = 0.833$. We
may note that for $p_0 > 1/3$, this solution gives a value of $\alpha$ greater than
unity. In such a case we take $\hat{\alpha} = 1$ to be the maximum likelihood
estimate since $\alpha$ can never be greater than unity. In general, we must be
cautious about setting the derivative of the likelihood function equal to
zero, as the function may not have an analytic maximum (maximum with
zero slope) in the allowed range of the parameter being estimated.
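
The caution about the allowed range can be made concrete with a small numerical sketch. It evaluates the likelihood of an observed sequence under the operators $Q_1 p = \alpha p + (1 - \alpha)$ and $Q_2 p = \alpha p$, and takes as the estimate the value of $\alpha$ on a grid over $[0, 1]$ that maximizes it, so boundary maxima are found automatically. The grid search and the function names are ours and are only one simple way of doing this.

```python
def likelihood(alpha, p0, sequence):
    """Probability of a response sequence (1 stands for A_1, 2 for A_2)."""
    p = p0
    prob = 1.0
    for response in sequence:
        if response == 1:
            prob *= p
            p = alpha * p + (1.0 - alpha)   # Q_1 p
        else:
            prob *= 1.0 - p
            p = alpha * p                   # Q_2 p
    return prob

def ml_estimate(p0, sequence, steps=10000):
    """Value of alpha in [0, 1] (on a grid) that maximizes the likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: likelihood(a, p0, sequence))

print(ml_estimate(0.2, [1, 2]))      # boundary maximum at alpha = 1
print(ml_estimate(0.2, [1, 1]))      # boundary maximum at alpha = 0
print(ml_estimate(0.2, [1, 2, 1]))   # interior maximum near 2 / (3 * 0.8)
print(ml_estimate(0.5, [1, 2, 1]))   # p0 > 1/3: estimate is clipped to 1
```

For the sequence $A_1 A_2 A_1$ with $p_0 = 0.5$ the zero of the derivative lies outside the allowed range, and the grid maximum falls at the boundary value 1, as the text prescribes.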

The method of maximum likelihood has very wide applicability. It 
usually leads to a solution, but unfortunately computational difficulties 
are often tremendous. This seems to be true when we try to use the 
method for obtaining simultaneous estimates of all our parameters in the 
general case. The likelihood function itself is embarrassingly lengthy, 
when we have, say, twenty-five trials for ten subjects. And then this 
function would have to be maximized with respect to the five variables
$\lambda_1$, $\alpha_1$, $\lambda_2$, $\alpha_2$, and $p_0$, simultaneously. The procedure seems completely
unfeasible except for high-speed machine computing of particular examples. 
But a program is available if we care to expend the necessary labor. In 
the following chapters we use the maximum likelihood method in a more 
restricted way. We use only a portion of the data and usually estimate 
only one parameter at a time, sometimes two or three. The results are 
quite simple in some cases. We give up some information to obtain this 
simplicity. 

When we obtain a maximum likelihood estimate it is sometimes easy
to compute the asymptotic variance of the estimate. It is well known
from mathematical statistics that the asymptotic variance $\sigma^2(\hat{\theta})$ is the negative
reciprocal of the expected value of the quantity

(9.34)    $\dfrac{d^2 \log P}{d\theta^2}$,

where $P$ is the likelihood function and $\theta$ is the parameter being estimated,
that is,

(9.35)    $\sigma^2(\hat{\theta}) = \dfrac{-1}{E\left(\dfrac{d^2 \log P}{d\theta^2}\right)}$.

The second derivative ordinarily involves both the true parameter value
and the observations. When we take the expected value we often are
taking expected values of simple functions of the observations, and so it
is sometimes easy to get the variance. For example, in the binomial case,
if we observe $x$ successes in $n$ trials, and the probability of a success on a
single trial is $p$, we have

(9.36)    $P = p^x (1 - p)^{n - x}$,
          $\log P = x \log p + (n - x) \log (1 - p)$,
          $\dfrac{d \log P}{dp} = \dfrac{x}{p} - \dfrac{n - x}{1 - p}$.

When this derivative is set equal to zero and solved for the maximum
likelihood estimate, we get $\hat{p} = x/n$ as expected. Taking the second
derivative,

(9.37)    $\dfrac{d^2 \log P}{dp^2} = -\dfrac{x}{p^2} - \dfrac{n - x}{(1 - p)^2}$.

But the expected value of $x$ is $np$, and the expected value of $n - x$ is
$n(1 - p)$, so

(9.38)    $-E\left(\dfrac{d^2 \log P}{dp^2}\right) = \dfrac{n}{p} + \dfrac{n}{1 - p} = \dfrac{n}{p(1 - p)}$.

The reciprocal of this is the well-known result

(9.39)    $\sigma^2(\hat{p}) = \dfrac{p(1 - p)}{n}$.

In this instance the variance and the asymptotic variance are identical.
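
The identity of the two variances in the binomial case can be checked numerically. The sketch below computes the exact variance of $x/n$ from the binomial probabilities and compares it with $p(1 - p)/n$; the function name and the values $n = 10$, $p = 0.7$ are chosen by us for illustration.

```python
from math import comb

def variance_of_proportion(n, p):
    """Exact variance of x/n when x is binomial with parameters n and p."""
    weights = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    mean = sum((x / n) * w for x, w in enumerate(weights))
    return sum(((x / n) - mean) ** 2 * w for x, w in enumerate(weights))

exact = variance_of_proportion(10, 0.7)
asymptotic = 0.7 * (1 - 0.7) / 10      # p(1 - p)/n
print(exact, asymptotic)
```

The two numbers agree to within rounding error of the floating-point arithmetic.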

We shall neither prove this theorem about the variance of maximum
likelihood estimates nor state the conditions under which it holds, but
we will make use of it in the following chapters. The description just
given for estimating single parameters suggests that we would be in a
position to get a good idea of the variance of our estimates when we use
maximum likelihood and large samples. But we are cheating a little,
and possibly a lot. If there are several parameters to estimate
simultaneously, maximum likelihood methods can give a good idea of the
variance and covariance of the estimates. Ordinarily, we estimate
parameters singly, and in each case as if the values of all other parameters
were known. Consequently our variances ought really to be adjusted
for the fact that we do not know the values of all parameters except the
one in question. Finally, of course, we will not really know the variances,
but only have estimates of them on the basis of the estimated parameters.
There are, therefore, a number of loose ends lying around.


9.9 A SPECIAL MAXIMUM LIKELIHOOD PROBLEM 
We now develop some maximum likelihood equations which we use
in the following chapters to estimate parameters from data.
The rest of this chapter is technical and specific to estimation of parameters
in special cases of interest (see also [9]). The reader may care to skip
forward to a treatment of an experiment, and return to this material when
he needs it. This development is not applicable to all the problems we
take up later, but it is used in Chapters 10 and 11. The procedure to be
described is appropriate whenever we have a set of probabilities $q_\nu$ related
by the equation

(9.40)    $q_\nu = \alpha^\nu q_0$.

The data provide information about the $q_\nu$'s, and the problem is to estimate
$\alpha$ or $q_0$ or both.


Equations of this type occur, for example, when we apply an operator
$Q_2$ of the form

(9.41)    $Q_2 p = \alpha_2 p + (1 - \alpha_2)$

on every trial for several trials. Applying this operator is equivalent to
applying an operator $\tilde{Q}_2$ to $q = 1 - p$:

(9.42)    $\tilde{Q}_2 q = \alpha_2 q$.

As we have seen, when we apply such an operator on every trial we have
for trial $n$, the probability

(9.43)    $q_n = \tilde{Q}_2^n q_0 = \alpha_2^n q_0$.

In this application, $\nu$ in equation 9.40 is the trial number $n$. In other
applications, however, $\nu$ is not the trial number but denotes the number of
occurrences of one response. Specifically, if one operator is an identity
operator and the other is of the form $Qp = \alpha p + (1 - \alpha)$ or $Qp = \alpha p$,
then again we get equations of the type 9.40, but $\nu$ stands for number of
occurrences of the response which increases or decreases $p$. In Chapter
10 we use the operators

(9.44)    $Q_1 p = \alpha_1 p + (1 - \alpha_1)$,
          $Q_2 p = p$,

or the equivalent operators

(9.45)    $\tilde{Q}_1 q = \alpha_1 q$,
          $\tilde{Q}_2 q = q$.

The probability $q_\nu$ of response $A_2$ after $\nu$ occurrences of response $A_1$ is

(9.46)    $q_\nu = \tilde{Q}_1^\nu q_0 = \alpha_1^\nu q_0$.

The estimation problem is essentially the same, whether $\nu$ stands for trial
number in the first example given above, or for number of occurrences of
$A_1$ in the second example. For this reason we discuss this estimation
problem here rather than in connection with the particular applications
later.


The problem in estimation may now be summarized as follows. The
probability of some alternative $A_j$ is $q_\nu$ for a specified value of the index $\nu$,
and we have the relation

(9.47)    $q_\nu = \alpha^\nu q_0$,

where we assume $0 \le \alpha \le 1$. For each value of $\nu$, we have a number of
observations concerning whether or not alternative $A_j$ occurred. These
observations may be on a single subject or from a number of subjects.
We let the number of observations for a specified value of $\nu$ be $N_\nu$, and
we use the index $\mu$ to denote these $N_\nu$ observations, that is, $\mu = 1, 2, \cdots, N_\nu$.
The $\mu$th observation for a particular value of $\nu$ is simply whether $A_j$
occurred or not. We represent the data by a set of random variables $x_{\mu\nu}$.
We let $x_{\mu\nu} = 0$ if $A_j$ occurred and $x_{\mu\nu} = 1$ if it did not. The data are
thereby reduced to a set of 0's and 1's. These could just as well have been
checks and pluses or "yeses" and “noes,” but the 0's and 1's have a
distinct advantage, as we shall see. We can define a quantity $x_\nu$ by

(9.48)    $x_\nu = \sum_{\mu=1}^{N_\nu} x_{\mu\nu}$.

This sum is simply the number of times $A_j$ did not occur on $N_\nu$ observations,
for we enter a zero in the sum when $A_j$ occurs and the number one when
$A_j$ does not occur. The number of occurrences of $A_j$ during the $N_\nu$
observations is of course $N_\nu - x_\nu$. The data give the values of all the
$x_{\mu\nu}$'s, and from these we want to obtain the maximum likelihood estimates
of $\alpha$ and $q_0$.

When we say that $x_{\mu\nu} = 0$ we are saying that $A_j$ occurred on the $\mu$th
observation for a specified value of $\nu$. The probability that $A_j$ occurred is
$q_\nu$, and so $q_\nu$ is the probability that $x_{\mu\nu} = 0$ and $1 - q_\nu$ is the probability
that $x_{\mu\nu} = 1$. Therefore the expected value of $x_{\mu\nu}$ is

(9.49)    $E(x_{\mu\nu}) = 1 - q_\nu$,

and the expected value of $x_\nu$ is (for fixed $N_\nu$)

(9.50)    $E(x_\nu) = \sum_{\mu=1}^{N_\nu} E(x_{\mu\nu}) = N_\nu(1 - q_\nu)$.

Therefore, an unbiased estimate of $1 - q_\nu$ is the ratio $x_\nu/N_\nu$. Thus, an
unbiased estimate of $q_\nu$ is

(9.51)    $\dfrac{N_\nu - x_\nu}{N_\nu} \,\hat{=}\, q_\nu$.

This is the obvious estimate of $q_\nu$, the proportion of occurrences of $A_j$
in the $N_\nu$ observations. From our data we can thereby obtain estimates
of all the $q_\nu$. We could then combine these estimates in some way, for
the various values of $\nu$, and get estimates of $\alpha$ and $q_0$ through equation
9.47. But we wish to obtain the maximum likelihood estimates of $\alpha$ and
$q_0$, and so we consider all values of $\nu$ simultaneously and avoid the necessity
of estimating each of the $q_\nu$ and then combining them.

We want to write down the probability or likelihood of obtaining the
entire set of $x_{\mu\nu}$'s. We begin by writing down the likelihood $P_{\mu\nu}$ of
obtaining these numbers for single values of $\nu$ and $\mu$. From the last
paragraph we see that

(9.52)    $P_{\mu\nu} = \begin{cases} 1 - q_\nu & \text{if } x_{\mu\nu} = 1, \\ q_\nu & \text{if } x_{\mu\nu} = 0. \end{cases}$

A convenient compact way of writing this is

(9.53)    $P_{\mu\nu} = (1 - q_\nu)^{x_{\mu\nu}}\, q_\nu^{\,1 - x_{\mu\nu}}$.

This is equivalent to the preceding statement (equation 9.52) since, when
$x_{\mu\nu} = 1$, the first factor is $1 - q_\nu$ and the second factor is unity; and, when
$x_{\mu\nu} = 0$, the first factor is unity and the second factor is $q_\nu$. Next we
write down the likelihood $P_\nu$ of obtaining the set of $x_{\mu\nu}$ for $\mu = 1, 2, \cdots, N_\nu$
but for a single value of $\nu$. It is just the product

(9.54)    $P_\nu = \prod_{\mu=1}^{N_\nu} P_{\mu\nu} = \prod_{\mu=1}^{N_\nu} (1 - q_\nu)^{x_{\mu\nu}}\, q_\nu^{\,1 - x_{\mu\nu}}$.

This product is simply $(1 - q_\nu)$ to some power times $q_\nu$ to some other power.
But in the whole product the exponent of $(1 - q_\nu)$ is just $x_\nu$ of equation
9.48, which is the number of non-occurrences of $A_j$. Similarly, the
exponent of $q_\nu$ in the product is $N_\nu - x_\nu$, the number of occurrences of $A_j$.
Hence,

(9.55)    $P_\nu = (1 - q_\nu)^{x_\nu}\, q_\nu^{\,N_\nu - x_\nu}$.

Finally, we want the likelihood $P$ of obtaining the whole set of data
given by all the $x_{\mu\nu}$'s. We let $\nu$ range from 0 to some number $\Omega$. The
likelihood $P$ is then the product

(9.56)    $P = \prod_{\nu=0}^{\Omega} P_\nu = \prod_{\nu=0}^{\Omega} (1 - q_\nu)^{x_\nu}\, q_\nu^{\,N_\nu - x_\nu}$.

We now insert the expression for $q_\nu$ given by equation 9.47 and get

(9.57)    $P = \prod_{\nu=0}^{\Omega} (1 - \alpha^\nu q_0)^{x_\nu} (\alpha^\nu q_0)^{N_\nu - x_\nu}$.

It gives the likelihood of the observed numbers $x_\nu$ and $N_\nu$, obtained
from the data, in terms of the parameters to be estimated. Taking the
logarithm of this equation we then get

(9.58)    $\log P = \sum_{\nu=0}^{\Omega} \{x_\nu \log (1 - \alpha^\nu q_0) + (N_\nu - x_\nu) \log (\alpha^\nu q_0)\}$.


We now use the standard procedure for maximizing this expression.
First we take the partial derivative with respect to $\alpha$ and get

(9.59)    $\dfrac{\partial \log P}{\partial \alpha} = \sum_{\nu=0}^{\Omega} \left\{ x_\nu \dfrac{-\nu \alpha^{\nu-1} q_0}{1 - \alpha^\nu q_0} + (N_\nu - x_\nu) \dfrac{\nu}{\alpha} \right\}$.

If we know the value of $q_0$, we need only set this derivative equal to zero
and solve for $\alpha$. When we do this, we replace $\alpha$ with $\hat{\alpha}$, which is the
maximum likelihood estimate of $\alpha$. We cancel out the common factor
$1/\alpha$ and get

(9.60)    $\sum_{\nu=0}^{\Omega} \nu (N_\nu - x_\nu) = \sum_{\nu=0}^{\Omega} \nu x_\nu \dfrac{\hat{\alpha}^\nu q_0}{1 - \hat{\alpha}^\nu q_0}$.

This equation must be solved for $\hat{\alpha}$, but unfortunately this is not easy.
In the following section we propose some procedures for doing so.

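When we wish to estimate both $\alpha$ and $q_0$ from the data, the problem is
a little more involved. We must return to the expression for $\log P$,
equation 9.58 above, and take the partial derivative with respect to $q_0$:

(9.61)    $\dfrac{\partial \log P}{\partial q_0} = \sum_{\nu=0}^{\Omega} \left\{ x_\nu \dfrac{-\alpha^\nu}{1 - \alpha^\nu q_0} + (N_\nu - x_\nu) \dfrac{1}{q_0} \right\} = \dfrac{1}{q_0} \sum_{\nu=0}^{\Omega} \left\{ (N_\nu - x_\nu) - x_\nu \dfrac{\alpha^\nu q_0}{1 - \alpha^\nu q_0} \right\}$.

We then set equal to zero both of the partial derivatives of $\log P$. In so
doing we replace $\alpha$ by its estimate $\hat{\alpha}$ and $q_0$ by its estimate $\hat{q}_0$, and get the
pair of equations

(9.62)    $\sum_{\nu=0}^{\Omega} \nu (N_\nu - x_\nu) = \sum_{\nu=0}^{\Omega} \nu x_\nu \dfrac{\hat{\alpha}^\nu \hat{q}_0}{1 - \hat{\alpha}^\nu \hat{q}_0}$,
          $\sum_{\nu=0}^{\Omega} (N_\nu - x_\nu) = \sum_{\nu=0}^{\Omega} x_\nu \dfrac{\hat{\alpha}^\nu \hat{q}_0}{1 - \hat{\alpha}^\nu \hat{q}_0}$.

These two equations must be solved for the maximum likelihood estimates
$\hat{\alpha}$ and $\hat{q}_0$. This creates some rather serious computational problems; we
discuss them in Section 9.11.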


9.10 PROCEDURES FOR COMPUTING $\hat{\alpha}$


We now describe some procedures for computing the maximum
likelihood estimate $\hat{\alpha}$ when the value of $q_0$ is known. This involves solving
equation 9.60 of the last section. The left-hand side causes no trouble
for it is determined completely by the data; we call it $D_1$:

(9.63)    $D_1 = \sum_{\nu=0}^{\Omega} \nu (N_\nu - x_\nu)$.

From a particular set of data we can tabulate $N_\nu$ and $x_\nu$ for each value of
$\nu$ and readily compute the value of the sum $D_1$. But the right-hand side
of equation 9.60 involves the $x_\nu$ obtained from the data in addition to the
known parameter $q_0$ and the unknown estimate $\hat{\alpha}$. The problem is to
solve for $\hat{\alpha}$. We have found three procedures useful under different
conditions.

The first procedure is appropriate for the special case when $q_0 = 1$.
We then have from equations 9.63 and 9.60

(9.64)    $D_1 = \sum_{\nu=0}^{\Omega} \nu x_\nu \dfrac{\hat{\alpha}^\nu}{1 - \hat{\alpha}^\nu}$.

Introducing the abbreviation

(9.65)    $g_\nu(\alpha) = \dfrac{\nu \alpha^\nu}{1 - \alpha^\nu}$,

we can write

(9.66)    $D_1 = \sum_{\nu=0}^{\Omega} x_\nu g_\nu(\hat{\alpha})$.

Table C gives the function $g_\nu(\alpha)$ for a range of values of $\nu$
and $\alpha$. These tables, along with the $x_\nu$ from a particular set of data,
permit us to compute the sum on the right side of the last equation
relatively easily for any value of $\hat{\alpha}$. This must be done for various values
of $\hat{\alpha}$ until we get as close as possible to the correct value, $D_1$. This is a
trial-and-error procedure, but Table C facilitates the method considerably.
We illustrate this technique in Chapter 11.

The second procedure for obtaining $\hat{\alpha}$ from equation 9.60 is useful in
some kinds of data for which the $x_\nu$ are equal for all values of $\nu$ within the
range from zero to $\Omega$. In other words, we can sometimes choose $\Omega$ so
that $x_\nu = x$ for $\nu = 0, 1, 2, \ldots, \Omega$. Under this condition we can factor
out $x_\nu = x$ and get from equations 9.60 and 9.63

(9.67)    $\dfrac{D_1}{x} = \sum_{\nu=0}^{\Omega} \dfrac{\nu \hat{\alpha}^\nu q_0}{1 - \hat{\alpha}^\nu q_0}$.

We denote the function on the right by $G(\hat{\alpha}, q_0, \Omega)$; it is an example of
the function

(9.68)    $G(\alpha, \beta, \Omega) = \sum_{\nu=0}^{\Omega} \dfrac{\nu \alpha^\nu \beta}{1 - \alpha^\nu \beta}$,

which we present in Table D. We may then write

(9.69)    $G(\hat{\alpha}, q_0, \Omega) = D_1/x$.

From a set of data, we can compute $D_1/x$ and then use Table D to obtain
the nearest value of $\hat{\alpha}$ for the known values of $q_0$ and $\Omega$. In Chapter 10
we illustrate the procedure.
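The third procedure for determining the estimate $\hat{\alpha}$ is an approximation
for the case of $q_0 = 1$ which is especially useful when $\hat{\alpha}$ is near unity.
We expand $\hat{\alpha}^\nu$ in a power series around $\hat{\alpha} = 1$:

(9.70)    $\hat{\alpha}^\nu = [1 - (1 - \hat{\alpha})]^\nu = 1 - \nu(1 - \hat{\alpha}) + \cdots$.

The annoying factor in equations 9.60 and 9.64 then becomes

(9.71)    $\dfrac{\hat{\alpha}^\nu}{1 - \hat{\alpha}^\nu} \cong \dfrac{1 - \nu(1 - \hat{\alpha})}{\nu(1 - \hat{\alpha})}$.

We drop terms beyond the linear one shown and have then, from equation
9.64,

(9.72)    $D_1 \cong \sum_{\nu=0}^{\Omega} x_\nu \dfrac{1 - \nu(1 - \hat{\alpha})}{1 - \hat{\alpha}} = \dfrac{1}{1 - \hat{\alpha}} \sum_{\nu=0}^{\Omega} x_\nu - \sum_{\nu=0}^{\Omega} \nu x_\nu$.

But $D_1$ is defined by equation 9.63, that is,

(9.73)    $D_1 = \sum_{\nu=0}^{\Omega} \nu N_\nu - \sum_{\nu=0}^{\Omega} \nu x_\nu$.

Combining the last two equations gives

(9.74)    $\hat{\alpha} \cong 1 - \dfrac{\sum_{\nu=0}^{\Omega} x_\nu}{\sum_{\nu=0}^{\Omega} \nu N_\nu}$.

This approximate formula may be used to estimate $\alpha$ directly from the
data without the use of tables. Even when $\hat{\alpha}$ is not very near unity, the
foregoing formula is useful in obtaining a preliminary value of $\hat{\alpha}$. Having
such a preliminary value shortens the exact computations described above.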
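
The trial-and-error solution and the preliminary value can both be sketched in code. The sketch below reads equation 9.60 as $\sum \nu (N_\nu - x_\nu) = \sum \nu x_\nu \hat{\alpha}^\nu q_0/(1 - \hat{\alpha}^\nu q_0)$ and equation 9.74 as $\hat{\alpha} \cong 1 - \sum x_\nu / \sum \nu N_\nu$. It replaces the table lookup by bisection, which works because the right side increases with $\hat{\alpha}$. The function names are ours, and the data are hypothetical and noise-free: they are the expected counts for $\alpha = 0.9$, $q_0 = 1$.

```python
def d1(n, x):
    """Left side of equation 9.60: sum of v * (N_v - x_v), v = 0..Omega."""
    return sum(v * (nv - xv) for v, (nv, xv) in enumerate(zip(n, x)))

def rhs(alpha, q0, x):
    """Right side of equation 9.60 for a trial value of alpha."""
    total = 0.0
    for v, xv in enumerate(x):
        if v == 0 or xv == 0:
            continue                  # the term carries the factor v * x_v
        t = alpha ** v * q0
        total += v * xv * t / (1.0 - t)
    return total

def approx_alpha(n, x):
    """Preliminary value from equation 9.74 (q0 = 1, alpha near unity)."""
    return 1.0 - sum(x) / sum(v * nv for v, nv in enumerate(n))

def solve_alpha(n, x, q0, iterations=60):
    """Bisection on equation 9.60; the right side increases with alpha."""
    target = d1(n, x)
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if rhs(mid, q0, x) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical counts: N_v = 100 and x_v = N_v * (1 - 0.9**v) for v = 0..5.
n = [100] * 6
x = [100 * (1 - 0.9 ** v) for v in range(6)]
print(approx_alpha(n, x), solve_alpha(n, x, 1.0))
```

With these data the preliminary value is about 0.91 and the exact solution is 0.90; the preliminary value could be used to narrow the initial bracket.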


9.11 PROCEDURE FOR COMPUTING $\hat{\alpha}$ AND $\hat{q}_0$

When we wish to obtain the simultaneous maximum likelihood estimates
of $\alpha$ and $q_0$ for the problem discussed in Section 9.9, we must solve a pair
of equations (9.62). This ordinarily involves a great deal of computational
labor. We provide a short-cut procedure for only one special case: when
the $x_\nu$ are independent of $\nu$ over the range of values of $\nu$, that is, when
$x_\nu = x$ for $\nu = 0, 1, \cdots, \Omega$.

From the data we obtain two statistics $D_1$ and $D_2$ defined by

(9.75)    $D_1 = \sum_{\nu=0}^{\Omega} \nu (N_\nu - x_\nu)$,
          $D_2 = \sum_{\nu=0}^{\Omega} (N_\nu - x_\nu)$.

In Table D we give the functions

(9.76)    $G(\alpha, \beta, \Omega) = \sum_{\nu=0}^{\Omega} \dfrac{\nu \alpha^\nu \beta}{1 - \alpha^\nu \beta}$,
          $F(\alpha, \beta, \Omega) = \sum_{\nu=0}^{\Omega} \dfrac{\alpha^\nu \beta}{1 - \alpha^\nu \beta}$.

From equations 9.62 we then have

(9.77)    $G(\hat{\alpha}, \hat{q}_0, \Omega) = D_1/x$,
          $F(\hat{\alpha}, \hat{q}_0, \Omega) = D_2/x$.

Having computed $x$, $D_1$ and $D_2$ from the data, we may use Table D to
find the nearest pair of values of the functions $F$ and $G$ and thereby
obtain the estimates $\hat{\alpha}$ and $\hat{q}_0$. Because interpolation is usually necessary
we illustrate the procedure in Chapter 10.
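
Where the tables are not at hand, the pair of estimates can be found by maximizing the log likelihood directly. The sketch below is not the tabular short cut of this section: it is a plain grid search over $\alpha$ and $q_0$, assuming the log likelihood $\log P = \sum \{x_\nu \log(1 - \alpha^\nu q_0) + (N_\nu - x_\nu) \log(\alpha^\nu q_0)\}$ of Section 9.9. The function names, the grid spacing of 0.01, and the noise-free hypothetical counts (expected values for $\alpha = 0.8$, $q_0 = 0.9$) are ours.

```python
import math

def log_likelihood(alpha, q0, n, x):
    """Log likelihood for counts n[v] and x[v], v = 0..Omega."""
    total = 0.0
    for v, (nv, xv) in enumerate(zip(n, x)):
        q = alpha ** v * q0
        total += xv * math.log(1.0 - q) + (nv - xv) * math.log(q)
    return total

def grid_estimate(n, x, steps=99):
    """Pair (alpha, q0) on a grid 0.01, 0.02, ..., 0.99 maximizing log P."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(((a, q) for a in grid for q in grid),
               key=lambda pair: log_likelihood(pair[0], pair[1], n, x))

# Hypothetical counts: N_v = 200, x_v = N_v * (1 - 0.8**v * 0.9), v = 0..6.
n = [200] * 7
x = [200 * (1 - 0.8 ** v * 0.9) for v in range(7)]
alpha_hat, q0_hat = grid_estimate(n, x)
print(alpha_hat, q0_hat)
```

Because the hypothetical counts equal their expected values, the search returns the generating values; with real data the maximum would fall near, not at, the true parameters.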


9.12 VARIANCE OF THE ESTIMATE $\hat{\alpha}$

We now return to the case described in Section 9.10 for which $q_0$ is
known and we compute the estimate $\hat{\alpha}$. As indicated in Section 9.7, it is
desirable to have an estimator which has a small variance, and so we
inquire about the variance of our likelihood estimate $\hat{\alpha}$.




At the end of Section 9.8 we stated a well-known theorem about the 
asymptotic variance, o*(0), of an estimate 6 of some parameter 0. In our 
problem the parameter is « and the estimate is 2. Thus we need to 


compute the second derivative of log P of equation 9.58 with respect to a. 
From the first derivative given by equation 9.59 we get 


Q 
o? v END | 
ga 8P PA a[e-9- 2% 


(9.78) 
ra = i eroe x 
a [i1—«q« Aeg 1 

or 
Q ji 

s v " EC) 
aj 08 P t fw, s) — nT s, 

r=0 

(9.79) 
" qo 


We now need the expected value of this second derivative. We see from
equation 9.50 that

(9.80)    $E(x_\nu) = N_\nu(1 - q_\nu) = N_\nu(1 - \alpha^\nu q_0).$


Thus we get

(9.81)    $-E\left(\frac{\partial^2}{\partial\alpha^2} \log P\right) = \frac{1}{\alpha^2} \sum_{\nu=0}^{\Omega} \frac{\nu^2 N_\nu \alpha^\nu q_0}{1 - \alpha^\nu q_0}.$
From the theorem then we have, for the asymptotic variance of $\hat{\alpha}$,

(9.82)    $\sigma^2(\hat{\alpha}) = \frac{\alpha^2}{\sum_{\nu=0}^{\Omega} \frac{\nu^2 N_\nu \alpha^\nu q_0}{1 - \alpha^\nu q_0}}.$

We may estimate $\sigma^2(\hat{\alpha})$ by replacing $\alpha$ with its estimate $\hat{\alpha}$ in the right side of
this equation. That $N_\nu$ and $\Omega$ are random variables has been neglected.
The expression just obtained for the asymptotic variance of $\hat{\alpha}$ leads to a
considerable amount of computational labor, but when $q_0 = 1$ we may
use the approximation introduced in Section 9.10. Using equation 9.71
in the sum, we have


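(9.83)    $\sum_{\nu=0}^{\Omega} \frac{\nu^2 N_\nu \alpha^\nu}{1 - \alpha^\nu} \approx \sum_{\nu=0}^{\Omega} \nu N_\nu \frac{1 - \nu(1 - \alpha)}{1 - \alpha} = \frac{\sum_{\nu=0}^{\Omega} \nu N_\nu}{1 - \alpha} - \sum_{\nu=0}^{\Omega} \nu^2 N_\nu.$

The asymptotic variance is then given approximately by

(9.84)    $\sigma^2(\hat{\alpha}) \approx \frac{\alpha^2 (1 - \alpha)}{\sum_{\nu=0}^{\Omega} \nu N_\nu - (1 - \alpha) \sum_{\nu=0}^{\Omega} \nu^2 N_\nu} \qquad (q_0 = 1).$

This approximate formula is much easier to use than equation 9.82, as
we see in Chapter 11.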
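
The following is a minimal sketch, not the authors' procedure, of how the two variance formulas might be compared numerically. It assumes the binomial likelihood implied by equation 9.80, so that the information about $\alpha$ is $(1/\alpha^2)\sum \nu^2 N_\nu \alpha^\nu q_0/(1 - \alpha^\nu q_0)$, and it assumes that the approximation for $q_0 = 1$ replaces $\alpha^\nu$ by $1 - \nu(1 - \alpha)$. The function names and the illustrative counts are hypothetical.

```python
def asymptotic_variance(alpha, q0, counts):
    """Variance of the estimate of alpha from the full information sum.

    counts[v] is N_v, the number of trials with index v (v = 0, 1, ...).
    Assumes the likelihood implied by E(x_v) = N_v (1 - alpha**v * q0).
    The v = 0 term contributes nothing and is skipped, which also avoids
    a zero denominator when q0 = 1.
    """
    info = sum(
        v * v * n_v * alpha**v * q0 / (1.0 - alpha**v * q0)
        for v, n_v in enumerate(counts)
        if v > 0
    )
    return alpha**2 / info


def approximate_variance(alpha, counts):
    """Approximation for q0 = 1, assuming alpha**v ~ 1 - v (1 - alpha)."""
    s1 = sum(v * n_v for v, n_v in enumerate(counts))
    s2 = sum(v * v * n_v for v, n_v in enumerate(counts))
    return alpha**2 * (1.0 - alpha) / (s1 - (1.0 - alpha) * s2)


if __name__ == "__main__":
    illustrative_counts = [10, 12, 15, 20]  # hypothetical N_0 .. N_3
    print(asymptotic_variance(0.98, 1.0, illustrative_counts))
    print(approximate_variance(0.98, illustrative_counts))
```

The approximation should only be trusted when $\nu(1 - \alpha)$ is small for every $\nu$ in the sum; for the illustrative counts above the two values agree to within a few percent.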


9.13 GOODNESS-OF-FIT CONSIDERATIONS


In applying the mathematical system to a particular experimental
problem, that is, in constructing a model for that problem, there are three
major considerations: (1) identifications, (2) estimation of parameters,
and (3) goodness-of-fit. We have already discussed the first two
considerations, and in this section we comment on the question of
goodness-of-fit. How well does the model account for the data?

Several statistical techniques are available for testing for goodness-of-fit,
but most of these are not appropriate for the analyses in the following
chapters. The major reason is that our model implies a distribution of
probabilities on each trial. Consider first the most common criterion,
the Pearson chi-square ($\chi^2$) test. If we have a model that predicts that in
$N$ observations we have $Np$ successes and $N(1 - p)$ failures, and if we
observe $x$ successes and $N - x$ failures, the Pearson test statistic is (without
Yates' continuity correction)

(9.85)    $\chi^2 = \frac{(x - Np)^2}{Np} + \frac{[(N - x) - N(1 - p)]^2}{N(1 - p)} = \frac{(x - Np)^2}{Np(1 - p)}.$

If this statistic is larger than some critical value, found in a $\chi^2$ table, the
fit is considered unsatisfactory, that is, the hypothesis that the true
probability of a success is $p$ is rejected. Now suppose that the model predicted
that the sample of $N$ observations was stratified in such a way that $N_1$ had
a probability $p_1$ of a success and $N_2$ had a probability $p_2$ of a success,
where $N_1 + N_2 = N$. Furthermore, suppose that we did not know which
observations were associated with $p_1$ and which with $p_2$, so that we knew
only the total number of observed successes, $x$. We might be tempted to
replace $p$ in the foregoing equation for $\chi^2$ with the mean probability

(9.86)    $\bar{p} = \frac{N_1 p_1 + N_2 p_2}{N_1 + N_2} = \frac{N_1 p_1 + N_2 p_2}{N}.$

But this would be wrong, as we now show. Suppose $p_1 = 0$ and $p_2 = 1$;
then the model predicts precisely $N_2$ successes, that is, $x = N_2$. If $x \neq N_2$,
the model is wrong, and nothing further need be said, for the probability
of observing anything but $N_2$ successes is zero, assuming that the model is
correct. In other words, the distribution of the number of observed
successes is discrete with unity density at $x = N_2$ and zero density
elsewhere. The distribution of the quantity on the right side of equation
9.85 is not the $\chi^2$ distribution, and so the Pearson criterion is not
appropriate for this problem.

Now consider a less extreme case than the one just discussed. Suppose
that $p_1$ were small and $p_2$ were large. The variance of the observed
number of successes is small compared to $N\bar{p}(1 - \bar{p})$. The variance of
the observed number of successes is, in fact,

(9.87)    $\sigma^2(x) = N_1 p_1(1 - p_1) + N_2 p_2(1 - p_2) = N\bar{p} - (N_1 p_1^2 + N_2 p_2^2).$

The variance of the $p$ values is

(9.88)    $\sigma^2(p) = \frac{N_1 p_1^2 + N_2 p_2^2}{N} - \bar{p}^2.$

Therefore

(9.89)    $\sigma^2(x) = N\bar{p} - [N\sigma^2(p) + N\bar{p}^2] = N\bar{p}(1 - \bar{p}) - N\sigma^2(p).$

We see that when $p_1 = 0$ and $p_2 = 1$, then $\sigma^2(x) = 0$, whereas when
$p_1 = p_2 = \bar{p}$, $\sigma^2(p) = 0$, and so $\sigma^2(x) = N\bar{p}(1 - \bar{p})$. Now the $\chi^2$ test
for goodness-of-fit, given by equation 9.85, has the variance of the observed
number of successes in the denominator, and as we have just seen this
variance is not $N\bar{p}(1 - \bar{p})$ except when $\sigma^2(p) = 0$.

In Chapters 10 and 12 we find exceptions to the foregoing argument
about the inappropriateness of a $\chi^2$ test for goodness-of-fit. In Chapter
10, there will be a large collection of observations for which the same
$p$ value is appropriate, whereas in Chapter 12, all subjects have the same $p$
value on a given trial. Under these conditions the Pearson criterion can
be applied.

A possible test statistic is suggested by the discussion above. The
model predicts that on trial $n$ there will be $NV_{1,n}$ responses of type $A_1$,
and that the second raw moment of the $p$-value distribution is $V_{2,n}$.
Analogous to equation 9.87, the expression for the variance of the observed
number $x_n$ of $A_1$ responses is

(9.90)    $\sigma^2(x_n) = N(V_{1,n} - V_{2,n}).$

The proposed test statistic is then

(9.91)    $u_n = \frac{(x_n - NV_{1,n})^2}{N(V_{1,n} - V_{2,n})}.$

If the data for all trials were used, we suggest the test statistic

(9.92)    $U = \sum_n \frac{(x_n - NV_{1,n})^2}{N(V_{1,n} - V_{2,n})}.$
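
As an illustration only, the sketch below checks numerically that the variance of the number of successes in a stratified sample equals the binomial variance at the mean probability minus $N$ times the variance of the $p$ values, and evaluates the single-trial statistic built from the first two raw moments of the $p$-value distribution. The function names and the example numbers are hypothetical and are not taken from the text.

```python
def direct_variance(groups):
    """Variance of the success count; groups is a list of (N_k, p_k) pairs."""
    return sum(n * p * (1.0 - p) for n, p in groups)


def decomposed_variance(groups):
    """Same variance written as N*pbar*(1 - pbar) - N*var(p)."""
    total = sum(n for n, _ in groups)
    p_bar = sum(n * p for n, p in groups) / total
    var_p = sum(n * p * p for n, p in groups) / total - p_bar**2
    return total * p_bar * (1.0 - p_bar) - total * var_p


def single_trial_statistic(x_n, n_subjects, v1, v2):
    """(x_n - N*V1)^2 / (N*(V1 - V2)) for one trial.

    v1 and v2 are the first and second raw moments of the p values on
    that trial; the denominator is the variance of the observed count.
    """
    return (x_n - n_subjects * v1) ** 2 / (n_subjects * (v1 - v2))


if __name__ == "__main__":
    strata = [(20, 0.1), (30, 0.9)]       # hypothetical stratified sample
    print(direct_variance(strata))        # variance computed stratum by stratum
    print(decomposed_variance(strata))    # the same number via the decomposition
    print(direct_variance([(20, 0.0), (30, 1.0)]))  # extreme case: no variance
```

With strata of very different $p$ values the variance is far below the pooled binomial value, which is why dividing by $N\bar{p}(1 - \bar{p})$ understates the discrepancy.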


Inventing reasonable test statistics is easy, but the problem is to
determine the distribution of such a statistic. It seems plausible that the
statistic $U$ has approximately the $\chi^2$ distribution. The reasoning is that
if $x$ is a normally distributed variable with mean $m$ and variance $\sigma^2$, then
$(x - m)^2/\sigma^2$ has the $\chi^2$ distribution.


according to 7? with 
z? values has a z? 
n of the component 
ack some degrees of freedom 
But the crucial point concerns 
If on each trial we had a brand 


of-fit obtained by two or more methods. 
A goodness-of-fit test that we will use is the run test [10, 11, 12]. It is
an extremely simple test to use. We note on each trial whether the
observed proportion of $A_1$'s is above (+) or below (−) the theoretical
mean $V_{1,n}$. We observe a sequence of +'s and −'s such as the following:
VMN Th de deed ipem eec eeu e 


We then count the number of +'s, $n_1$, the number of −'s, $n_2$, and the
number of runs, $d$. In the example above, $n_1 = 9$, $n_2 = 7$, and d=.
We then ask if the number of runs is too large or too small, granted that
the model is correct. Swed and Eisenhart [10] have computed tables of
the probability of obtaining runs of various lengths for $n_1$ and $n_2$ from 1 to
20. For larger values of $n_1$ and $n_2$ we can determine the expected number
of runs from the formula

(9.93)    $E(d) = \frac{2 n_1 n_2}{n_1 + n_2} + 1$

and the variance from

(9.94)    $\sigma_d^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}.$

We then compute a normal deviate

(9.95)    $\frac{d - E(d)}{\sigma_d}$

and consult a normal table to test for significance [12].
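
The run test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the sign sequence is hypothetical (it is not the sequence printed in the text, although it was chosen to have nine +'s and seven −'s), and the function names are invented for the example.

```python
import math


def count_runs(signs):
    """Return (n1, n2, d): number of '+', number of '-', number of runs."""
    n1 = signs.count("+")
    n2 = signs.count("-")
    if not signs:
        return n1, n2, 0
    # A new run starts wherever two neighbouring signs differ.
    d = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return n1, n2, d


def runs_normal_deviate(n1, n2, d):
    """Mean, variance, and normal deviate for the number of runs d."""
    n = n1 + n2
    mean = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2) / (n * n * (n - 1.0))
    return mean, variance, (d - mean) / math.sqrt(variance)


if __name__ == "__main__":
    hypothetical_signs = "++--+++-----++++"  # illustrative sequence only
    n1, n2, d = count_runs(hypothetical_signs)
    mean, variance, deviate = runs_normal_deviate(n1, n2, d)
    print(n1, n2, d, mean, variance, deviate)
```

A large negative deviate indicates too few runs (long stretches on one side of the theoretical curve); a large positive deviate indicates too many.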

We have been discussing the problem of testing for goodness-of-fit in a 
rather technical way without making it very clear just what we expect of 
our general model. As we have said elsewhere in this book, our goal is 
to describe data adequately. This notion becomes more or less precise 
only when we specify what we mean by adequate. Another way of saying 
the same thing is that we want to account for most of the variability in 
learning data with our general model. But then what is “most” of the 
variability? Is 95 percent necessary? Answers to these questions are 
deeply rooted in a person’s basic philosophy of scientific method. What 
degree of perfection is required, and for what purpose? Suppose that a 
model accounted for only 50 percent of the variability in a set of data. 
Would we therefore reject the model as useless? Our answer is negative 
because later we might account for most of the remaining 50 percent of 
variability by considerations outside the scope of the model. For 
example, 40 percent of the variability might be a result of individual
subject differences, in which event the model would have been excellent
for identical subjects. Furthermore, the model may provide a framework 
—a base-line—for analyzing individual differences, and this alone would 
be useful. In other words, we do not insist upon a narrow acceptance 
region in model building. Models of the sort we are studying make
hundreds of predictions about a single set of data. When the predictions
for many properties are accurate, while inaccurate for others, we do not
have a simple acceptance-rejection problem that falls into the pattern of
classical tests of significance. That is, we do not make one test of significance
at the 5 percent level and let the model stand or fall by this result. Science
does not move this way. Rather, the goodness-of-fit tests give us
information about the satisfactory and unsatisfactory aspects of the model.

One final point on the broader question of goodness-of-fit: We do not 
expect any general model such as ours to describe adequately all experi- 
ments in learning. Many experiments cannot easily be made to fit into 
our basic paradigm of mutually exclusive alternatives. Others may fit 
the pattern in principle but may lead to serious mathematical
complications. But even some of the experiments which are easily handled by
our general model may not be well described by it. If this occurred only
infrequently we would not feel that the general model was doomed and 
henceforth useless. We would immediately ask why a particular set of 
data led to poor agreement, and this might lead to significant further 
experimentation. Without the model we could not even have asked the 
question! A model which does adequately describe a fairly wide range of 
data may be used as a device for determining conditions under which an 
extended model or a different model is needed. 


9.14. SUMMARY 


The three major problems in applying the mathematical system of Part I 
to the analysis of particular experiments are discussed in this chapter. 
First we consider questions in identifying elements of the mathematical 
system with observable aspects of the behavior and environment of an 
organism. We discuss possible correspondence between our general
model and two main psychological theories, and how model parameters
may depend upon some important experimental variables. Next we
indicate the nature of the general problem of estimating parameters of the 
model from experimental data, and we present several procedures to be 
used in the following chapters. Finally we discuss some methods for 
measuring the goodness-of-fit of a model to particular data. 


REFERENCES 

1. Postman, L. The history and present status of the law of effect. Psychol. Bull.,
1947, 44, 489-563.

2. Hull, C. L. The principles of behavior. New York: Appleton-Century-Crofts,
1943.

3. Spence, K. W. Theoretical interpretations of learning. Handbook of experimental
psychology, S. S. Stevens, ed., New York: Wiley, 1951.

4. Guthrie, E. R. The psychology of learning. New York: Harper, 1935.

5. Guthrie, E. R. Association and the law of effect. Psychol. Rev., 1940, 47, 127-148.

6. Kendall, M. G. The advanced theory of statistics, Vol. I. London: J. B. Lippincott
Co., 1943, pp. 178-180.

7. Wilks, S. S. Mathematical statistics. Princeton: Princeton University Press,
1943, pp. 136-142.

8. Mood, A. M. Introduction to the theory of statistics. New York: McGraw-Hill,
1950, pp. 152-154.

9. Bush, R. R., and Mosteller, F. A stochastic model with applications to learning.
Annals of math. Stat., 1953, 24, 559-585.

10. Swed, F. S., and Eisenhart, C. Tables for testing randomness of grouping in a
sequence of alternatives. Annals of math. Stat., 1943, 14, 66-87.

11. Hoel, P. G. Introduction to mathematical statistics. New York: Wiley, 1947,
pp. 177-182.

12. Mood, A. M., loc. cit., pp. 390-394.

13. Savage, L. J. The foundations of statistics. New York: Wiley, 1954, pp. 220-245.


CHAPTER 10 


Free-Recall Verbal Learning 


10.1 THE EXPERIMENTS


An important area of experimental psychology during the last several
decades has been that of verbal learning [1]. Many experiments have
involved memorizing a list of words or nonsense syllables, and psychologists
have intensively studied serial effects, for example, the comparative ease
of memorizing words at the ends and middle of a list. Because of this
interest in serial effects, the order of the words in the list was maintained,
from trial to trial, in most of the experiments.

These experiments on serial rote learning have been of great importance
and have led to a number of useful concepts [2]. From the point of view
of quantitative models for learning, however, the problem of serial
learning or the chaining of responses is a complex one. A much simpler
problem results when the order of the words being memorized is
scrambled on each presentation of the list. In this way the serial
effects are eliminated or at least minimized, but the basic learning or
memorization process remains. Such experiments have been conducted
by Bruner, Miller, and Zimmerman [3]. The model developed by Miller
and McGill [4] was tailored to analyze data of this kind. Furthermore,
they demonstrated the correspondence between their model and the model
given in this book. The reader should find Miller and McGill's more
intensive discussion rewarding.

In this chapter we are concerned with this one type of free-recall verbal
learning experiment. The experiments are conducted as follows. A list
of $N$ monosyllabic words is read aloud to a subject. The subject is then
instructed to write down all the words that he can recall. The
experimenter gives him no indication of how well he has performed. Then the
order of the words is randomized, and the procedure is repeated. The
experiment is continued in this way for many trials until the proportion
of words recalled nearly reaches an asymptote. The experiments
conducted by Bruner, Miller, and Zimmerman used lists of 4, 8, 16, 32, and
64 words, and the experiments were continued in several cases for as many
as 32 trials. In this chapter we analyze data obtained from one such
experiment.


10.2 IDENTIFICATIONS AND ASSUMPTIONS 

A basic assumption made by Miller and McGill in analyzing data from
the experiments just described is that the $N$ words on the list are
independent of one another. This means that the subject's ability to recall a
particular word does not depend upon his past or present performance
with respect to any of the other words. We number the words by $i = 1,
2, \ldots, N$. These are merely labels; they do not represent the position
of the word in the list as these positions change from trial to trial.

Consider the $i$th word. On each trial this word either is recalled
(written down) or not recalled. Denote correct recall of a word by
response $A_1$, and non-recall by response $A_2$. The probability that the $i$th
word is recalled on trial $n$ is $p_{i,n}$. In the data we observe a sequence of
$A_1$'s and $A_2$'s (recalls and non-recalls) for this $i$th word. The probability
$p_{i,n}$ depends upon this sequence up to trial $n$, and by assumption does not
depend upon the sequences of $A_1$'s and $A_2$'s for other words. We have
$N$ sequences of $A_1$'s and $A_2$'s, one for each word on the list. Since the
words are assumed to be independent, we can think of these $N$ sequences
of responses as generated by $N$ different subjects.

We do assume, however, that all words have the same initial probability
of recall, $p_0$, and that all words have the same learning parameters $\lambda_1$
and $\alpha_1$. This is analogous to assuming that we have a group of $N$
"identical" subjects. We are willing to make such a drastic assumption because
the experimenters have gone to considerable trouble to arrange for the
words to have approximate equivalence. All words were English and
monosyllabic, the words were read aloud at a uniform rate, they had been
equated for their "articulation value," and the order was scrambled
between each presentation. This does not mean, of course, that the
recall probabilities, $p_{i,n}$, are the same for all words on trial $n$ for $n > 0$. On
the other hand, it does imply that if the sequences of $A_1$'s and $A_2$'s for two
words happen to be identical up to trial $n$, then the recall probabilities
for these two words on trial $n$ will be equal. Roughly speaking, this
assumption means that the words are equally difficult and that position on
the list is irrelevant. These assumptions are not obviously correct by
any means, but they represent idealizations and simplifications which
permit a relatively simple analysis of the data to be made. The test, of
course, is how well the model reproduces or simulates the data.

The next basic assumption made by Miller and McGill is that a
non-recall of the $i$th word does not change its probability of being recalled on
the next trial. Thus the operator $Q_2$, which is applied when $A_2$ occurs,
is the identity operator, that is, $\alpha_2 = 1$, and so

(10.1)    $p_{i,n+1} = Q_2 p_{i,n} = p_{i,n}.$

In the next section we examine some data to see if assumption 10.1 is
reasonable. When the word is recalled, that is, when $A_1$ occurs, we apply
an operator $Q_1$ of the general form

(10.2)    $p_{i,n+1} = Q_1 p_{i,n} = \alpha_1 p_{i,n} + (1 - \alpha_1)\lambda_1.$

This equation shows that when the recall probability is less than $\lambda_1$,
recall will increase that probability. Hence we are simply assuming that
there is a "practice effect" from writing down a word.

The two classes of events, recall and non-recall, change the recall
probabilities as defined by equations 10.1 and 10.2. We assume that no
other events alter these probabilities. Hence we can apply the analysis
given in Section 8.5 for the special case of one identity operator. The
alternative analysis given by Miller and McGill [4] could be used instead,
but for consistency and continuity we draw from the results given in
Chapter 8.

Until Section 10.7, we make one further assumption which limits the
range of applicability of the model. We assume that the subject can
learn the list of words perfectly, that is, that the proportion of words
recalled approaches an asymptote of unity. Thus, we take $\lambda_1 = 1$. The
operator $Q_1$, of equation 10.2, then becomes

(10.3)    $p_{i,n+1} = Q_1 p_{i,n} = \alpha_1 p_{i,n} + (1 - \alpha_1).$

This assumption will simplify the estimation procedures given later. Only
two parameters, the initial probability $p_0$ and the parameter $\alpha_1$, remain to
be estimated from the data.
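
A minimal simulation sketch of the two operators may make the assumptions concrete. It is an illustration, not part of the original analysis: the function names, the random-number generator, and the parameter values in the usage lines are all assumptions chosen for the example.

```python
import random


def apply_recall(p, alpha1, lambda1=1.0):
    """Operator applied after a recall: alpha1*p + (1 - alpha1)*lambda1."""
    return alpha1 * p + (1.0 - alpha1) * lambda1


def apply_nonrecall(p):
    """Identity operator applied after a non-recall (alpha2 = 1)."""
    return p


def simulate_word(p0, alpha1, n_trials, rng):
    """Simulate one word; returns a list of 1 (recall) and 0 (non-recall)."""
    p = p0
    outcomes = []
    for _ in range(n_trials):
        recalled = rng.random() < p
        outcomes.append(1 if recalled else 0)
        p = apply_recall(p, alpha1) if recalled else apply_nonrecall(p)
    return outcomes


if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed, for reproducibility only
    words = [simulate_word(0.26, 0.86, 32, rng) for _ in range(32)]
    proportions = [sum(w[n] for w in words) / 32 for n in range(32)]
    print(proportions)                # proportion recalled on each trial
```

Because a non-recall leaves the probability unchanged, the recall probability of a word under this sketch depends only on its number of previous recalls $k$, and with $\lambda_1 = 1$ it equals $1 - \alpha_1^k (1 - p_0)$.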


10.3 THE DATA

The particular data we analyze in detail was obtained by Bruner,
Miller, and Zimmerman [3] and was discussed by Miller and McGill [4].
A subject was read a list of 32 monosyllabic words and afterwards was
asked to write down all the words he could recall. The order of the words
was changed and the procedure was repeated. In this way the experiment
gives a sequence of recalls and non-recalls for each of the 32 words.
(Between some trials, the subject was given an "articulation test": the
words were read over a telephone system with noise introduced, and after
each word was read the subject stated what he had heard. We were
advised by Miller and McGill that their detailed analysis of the data
showed that these tests did not influence the learning data in any detectable
way.)

We find it convenient to represent the data by a set of random variables
$x_{i,n}$. We let $x_{i,n} = 1$ if the $i$th word is recalled on trial $n$ and let $x_{i,n} = 0$
if that word is not recalled on trial $n$. Thus for each word we obtain a
sequence of 1's and 0's. In Fig. 10.1 we show the proportion of words
correctly recalled on each trial. For purposes of analyzing the data we
need to know the number $k$ of previous recalls of a word on each trial,
and so we provide these data in Table 10.1. (The value of $x_{i,n}$ for trial $n$


Fig. 10.1. Observed proportions, $p_n$, of recalls on each trial from the Bruner-Miller-
Zimmerman data, and curve of the means, $V_{1,n}$, computed from equation 8.51, with
$\alpha_1 = 0.86$ and $p_0 = 0.22$. (Horizontal axis: trials, $n$.)


can be obtained by subtracting an entry in column $n$ from the entry in
column $n + 1$. For example, for the first word, there was no recall on trial
9 because both trials 9 and 10 h 


recall on trial 10 because trial 1] 


nd the recall parameter 2 
25 — l is consistent with the data. 


first six trials. On trial 0 there are, of course, no previous
recalls; from Table 10.1 we see that 9 of the 32 words are recalled on
trial 0. Of the remaining 23 words, 8 are recalled for the first time on
trial 1; of the remaining 15, 3 are recalled for the first time on trial 2,
etc. In Table 10.2 we show these numbers for trials 0, 1, 2, 3, 4, 5. Now
according to our null hypothesis, equation 10.1, the proportion of recalls
will be the same on all trials. Thus the expected number of recalls in
Table 10.2 is 27/98 times the number of words with zero previous recalls
on each trial. We now can test for goodness-of-fit by the conventional
Pearson $\chi^2$ test.



er srier|er|er|ezr| um] m|or|ie |s |s |z |o Is z t je |z |z pt |t |z |z|z|z| 1| t]olololo o IPA zE 
OZ | 61 | BE | 81 | 8t |:8t | £t |91| ST] pr] er|eviv]|or|e |s |z jo Is |e Je le It lo ojojojojojojojolo asn Ig 
OE | Gt | 8z | Lz | 9z | sz | pz | ez | ec | iz Oc | Gt) Br] ct jor) st) st] tr) er) cr) trrjor|e |s|zlo|s|re t|z|trjoj|o 901 OF 
ye) E| ee | ic | oc | oc | ot | sr | ci j| or] si| or|e (s |z jo |s |e | eleli ISESESESSQETE URIS GT 
LT | 9% | st} tc | ec | ec | iz | oz | 61 | SE | Z0 | 91 StU| et} et] ci] i} orl} 6 8 8 titi|9e|s|veie|elz|ritiit 0 anus gc 
OF | 6z | 8z | cz | oz | sz | sz | ez | zz | ic oc | er |si] ii| orj si| sifoi] er zer (o or|e sli 9|s|t|t|t|z|tio doy LZ 
Ol) st} SENET i ery} IT] OE ES 8 b L 9 s s s LAE. T v t t t HBESESESESESRETETT IE] 9wos 9c 
At | BT | el.) | St] cr lorie |e le le leds le le ie le fe tr la le fe Valera ft T|rT|TI[T/[0/0 qni ez 
Hizi|srieierieriujor|e [sg je |z fe te Je |e jo ls le (e le [a [o [ol ololo 9|ojolololjo 95H pz 
gt) pe | ec ez | re | oc  er|sr|zr|or|sr|er|er|er|er rrj or 6 |s jc jo |s is isiti? £|z|t|tit|olo0 wI (c 
BRIE Ie ISIS |$ I$ |g |f |£ |e |t | E | 1 fo fo Jo |o [o [€ |o lolelels 0|0|0|0|0|0 Bea zz 
Te | le} oc) ot} BE j cr} or) st) or} er}arlirjorj6 je |s js |z |z lo |s |s |e |e|ele o z|z|t|t|o|o]| usnd iz 
Oc) 6t | 61 | Gt] sr|erior|srier|erier|m]or|e |ó |é |s |z |o ls |e le le |ele e e|z|ct|t|t|it]o sisod Qc 
8z | LZ | 97 | st | ve | ec | ec} ie} oz |er|sr|zr|or|sr|er|er|er|er| vr|or|e |s Iż |9|si|s els T)t) tt) t}o] sued er 
SS | ez) ic} oz} or) gt} cr} or) st) er} er] Et arj u jore js |Z |e jo |s |e le lcle elels 1|ojojolo 10u gl 
Oe | Gt | sr zr [or | er sr|er|erjer|er|erjzr| wr ]|or|e [e |s |z jo is |s |s lekeli glz £|1|110/0 xoou L1 
It | OE | oz | ge zc | 97| sz] ez] ec| ec ic | oc | er sri er|or|sr| er | er|lzi| iu jor[e [elslz 9|s|r|t|z|tio EET 
eise epe wjorjerlée s jz je |o |9 is le le |e |e le [e |z |t |tltltli ojojojojojo PANY SI 
SIP E NE IS I£ TEIE [E IS [$ 18 tele le E ig lA ttle [E dy 1X HAUT 1|[11010|0|0]0 deay vt 
0t|6 8 L 9 s * m T LÀ t t t t t £ t t £ t £ t t t|t|z|c|t| i| riololo 9018 £[ 
GL APLE j er) er) i ENG |e jz to |s jo fe [e |e [e |e le le |e |e | i leis 1|1]0]0|0|0]|O0]| pne zt 
SZ | sc) ec z|:ujojer|srtziorisr|er[er|er|u]|orjorje |é js jz |z |9 Isib £|[z|1|0|0|0]|0]0 ues 11 
tz | 9z | sz | re] ec | ec) iz | oz érj sr zr|or|sr|er|er|er| un jor|e |s |z lols | ee £|t|z|t|t|t|t|o]| weap of 
9T | S? | ee | ez | cc | ic| oz] ot} gt | cr} or) st | tt] cr] ze i} orl 6 6 8 £19 S S|ve|t£|t|e|t|t|tit]|o 940] 6 
Ol] ol] or} 6 6 6 |6 |6 8 8 L L L L |9 $9 s s P € JT c I VIET] ty ty rlolojo|o| asuva 8 
pS a a lE Eri Vu ba dx i dE D dil eS EC aha ee a by sess L 
97 | st) ec | ec | ec | iz} oz] or} sr | zr|or| sed or] ec | a IL | 01) 6 8 L|9 Hi t £|[z|[t|ir|tit|t]|ojolo xoq 9 
PEERI ER i EE NOT I$ £r ILI ee de le le le le |* le le [€ dele c|c|z|zt|1|0|0]0)]| %unoq ¢ 
O | St) st) er) er[orisr|erierierjzer|m]or|e |e lz jo |se |e le le |t ot | tls 1|0/0|0|0|0|0 0 xSUq p 
LT | 9% | sc be | fc | eC | iz | Oc | or | or | St | SL | £t | Ot | St yjfr|zr[urjorjo |8 L|9|s|tie|te|te|z|t]|olo neq £ 
bl) yt] er] ct | tr] it | or] 6 8 L 9 H T t c [4 [4 T T [4 I l I Tit} t}o}ololojojojo peq z 
$c | 8E | LE | 9z | sc | rz | ez | ec iz | oc érjsijifoi| si| sifil eriz nlor|e js |e L|9|s|t|t&|z|t|ojo Eran 
TE | i£ | o£ | 6z | sz | £c | oz | ec | ez | ez | ec | ic oz|er|sr|zr|or|sr| st} er] cr} ir | or 6|8 zi OSs] ee) Ss) pio PIOM 
— 
u peng 


TABLE 10.1. Number of previous recalls of each word for each trial.
("Previous" means that the outcome for the trial numbered at the head of a
column is not included in the entry in that column.)


We obtained $\chi^2 = 2.87$; the number of degrees of freedom is 5, and the
probability obtained from a $\chi^2$ table is about 0.72. Therefore, we consider
the fit entirely adequate, and hence we have found evidence to support
the assumption that $\alpha_2 = 1$.
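
The arithmetic behind this test can be sketched as follows, using the counts tabulated in Table 10.2. The sketch assumes that the statistic is the usual Pearson sum over both the recalled and the not-recalled cells of each trial; the function name is hypothetical.

```python
def pooled_chi_square(n_words, n_recalled):
    """Pearson chi-square for equal recall proportions across trials.

    n_words[k] is the number of words with zero previous recalls on
    trial k and n_recalled[k] the number of those recalled on trial k.
    The common proportion is estimated by pooling all trials.
    """
    p = sum(n_recalled) / sum(n_words)
    chi2 = 0.0
    for n, x in zip(n_words, n_recalled):
        expected_recalls = n * p
        expected_misses = n * (1.0 - p)
        chi2 += (x - expected_recalls) ** 2 / expected_recalls
        chi2 += ((n - x) - expected_misses) ** 2 / expected_misses
    degrees_of_freedom = len(n_words) - 1
    return p, chi2, degrees_of_freedom


if __name__ == "__main__":
    words_with_zero_recalls = [32, 23, 15, 12, 10, 6]   # Table 10.2
    recalled_on_trial = [9, 8, 3, 2, 4, 1]              # Table 10.2
    print(pooled_chi_square(words_with_zero_recalls, recalled_on_trial))
```

Run on the tabulated counts, the statistic comes out close to the value 2.87 quoted above with 5 degrees of freedom; small differences arise from rounding the expected counts to two decimals.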

In the following three sections we discuss some procedures for estimating
the parameters $p_0$ and $\alpha_1$ from the data. Perhaps the ideal way to present
the estimation procedure would be to choose one method of estimating 
and carry it through for the particular data. If we did this, we might 
leave the reader with the mistaken impression that there is just one way 
to estimate. Instead of proceeding dogmatically, we present various 
approaches to the estimation problem for these data and discuss each 
briefly. 

10.4 ESTIMATION OF $p_0$
First consider the problem of estimating the initial recall probability $p_0$.
The obvious estimate of $p_0$ is the proportion of words recalled on trial 0.
This proportion, which we call $\bar{x}_0$, is given by

(10.4)    $\bar{x}_0 = \frac{1}{N} \sum_{i=1}^{N} x_{i,0}.$

We can think of the initial experimental trial as equivalent to $N$ binomial
trials with a probability $p_0$ of a success. An analogy is flipping $N$ identical
coins, each having a probability $p_0$ of falling heads on a single trial.
The proportion of coins, $\bar{x}_0$, which come up heads is an estimate of $p_0$.
This estimate is unbiased, and is the maximum likelihood estimate of $p_0$
when only the data on the initial experimental trial are considered. From
the data given in Table 10.1 we find that

(10.5)    $\bar{x}_0 = 9/32 = 0.281.$

The variance of this estimate is simply the binomial variance given by

(10.6)    $\sigma^2(\bar{x}_0) = \frac{p_0(1 - p_0)}{N}.$

We can estimate this theoretical variance of $\bar{x}_0$ by replacing $p_0$ with its
estimate $\bar{x}_0$. If we do this for our data we obtain
$\sigma^2(\bar{x}_0) = (0.281)(0.719)/32 = 0.0063$. The standard deviation,
$\sigma(\bar{x}_0)$, is then estimated by 0.079.

The variance of the estimate $\bar{x}_0$ is appreciable, and so we would like to
find a better estimate of $p_0$. Only the data from the initial trial were used
in obtaining the estimate $\bar{x}_0$. We can obtain a somewhat better estimate
without introducing serious complications, however. From the basic
assumptions being made in this chapter, we know that if an $A_2$ occurs on
trial 0 (non-recall), the recall probability on trial 1 is still $p_0$. (From
Table 10.2 we see that trial 1 gives an estimate of $p_0$ equal to 8/23 = 0.348.)
Similarly, if an $A_2$ occurs on trial 1, the recall probability on trial 2
remains $p_0$. (Again, Table 10.2 gives an estimate of $p_0$ equal to
3/15 = 0.200.) We may readily use the data for each word on trials up
through the trial on which recall first occurs to estimate $p_0$. This
procedure will utilize considerably more data than the separate trial
estimates. For each word we have


TABLE 10.2
Tabulations used in testing assumption that the non-recall parameter $\alpha_2 = 1$.

              Number of Words        Number of Those      Expected Number
Trial (n)     with Zero              Recalled on          Recalled on
              Previous Recalls       Trial n              Trial n

    0               32                     9                   8.82
    1               23                     8                   6.34
    2               15                     3                   4.13
    3               12                     2                   3.31
    4               10                     4                   2.76
    5                6                     1                   1.65

 Totals             98                    27                  27.01


merely to record the number of trials preceded by zero recalls and sum
these for all words; we denote this sum by $N_0$. On every such trial the
probability of recall is $p_0$, and so the proportion of recalls on these $N_0$
word-trials is an estimate of $p_0$. Since each word can be recalled for the
first time just once in the sequence, the number of recalls during the $N_0$
word-trials is simply $N$. Thus our new estimate, $\hat{p}_0$, of $p_0$ is

(10.7)    $\hat{p}_0 = N/N_0.$

For the data given in the last section we have $N = 32$ and $N_0 = 120$.
Thus

(10.8)    $\hat{p}_0 = 32/120 = 0.267.$


We next investigate the variance and bias of this estimate. We can
compute the expected number of word-trials, $N_0$, which is preceded by
zero recalls. The probability of obtaining a value $N_0$ is given by the
negative binomial distribution [5]:

(10.9)    $f(N_0) = \binom{N_0 - 1}{N - 1} (1 - p_0)^{N_0 - N} p_0^{N}.$

The expected value of $N_0$ is

(10.10)    $E(N_0) = N/p_0.$


We now demonstrate that $\hat{p}_0$ is the maximum likelihood estimate of $p_0$.
We use the standard procedure described in Section 9.8. First we take
the logarithm $L$ of the expression in equation 10.9, omitting the constant
coefficient, to obtain

(10.11)    $L = (N_0 - N) \log (1 - p_0) + N \log p_0.$

If we maximize this logarithm, we shall have maximized the likelihood of
obtaining the observed data. Hence we differentiate $L$ with respect to $p_0$:

(10.12)    $\frac{\partial L}{\partial p_0} = -\frac{N_0 - N}{1 - p_0} + \frac{N}{p_0}.$

When this derivative is set equal to zero we obtain for the maximum
likelihood estimate of $p_0$

(10.13)    $\hat{p}_0 = N/N_0,$

in agreement with equation 10.7. (To this point, the derivation is identical
with that for the ordinary binomial distribution.) Next we compute the
asymptotic variance of this estimate from the theorem given at the end
of Section 9.8. Taking the second derivative of $L$ gives

(10.14)    $\frac{\partial^2 L}{\partial p_0^2} = -\frac{N_0 - N}{(1 - p_0)^2} - \frac{N}{p_0^2}.$

Its expected value follows from equation 10.10,

(10.15)    $E\left(\frac{\partial^2 L}{\partial p_0^2}\right) = -\frac{(N/p_0) - N}{(1 - p_0)^2} - \frac{N}{p_0^2} = -\frac{N}{p_0^2 (1 - p_0)}.$

Therefore, the asymptotic variance of $\hat{p}_0$ is

(10.16)    $\sigma^2(\hat{p}_0) = \frac{p_0^2 (1 - p_0)}{N}.$

The reader may note that this variance is $p_0$ times the variance of our older
estimate $\bar{x}_0$ (see equation 10.6). This seems reasonable because the
expected number of non-recall trials is $N/p_0$; therefore we might expect to
have the first trial sample size, $N$, replaced by $N/p_0$, as it is. For the data
discussed above we get the estimate

(10.17)    $\sigma^2(\hat{p}_0) = \frac{(0.267)^2 (0.733)}{32} = 0.0016,$

and so

(10.18)    $\sigma(\hat{p}_0) = 0.04.$

This standard deviation is just half of that obtained for the estimate $\bar{x}_0$.
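
The two estimates of the initial recall probability and their estimated standard deviations can be reproduced with a few lines. This is a sketch for checking the arithmetic; the function names are hypothetical, and the inputs (9 recalls among 32 words on trial 0, and 120 word-trials preceded by zero recalls) are the figures quoted in the text.

```python
import math


def first_trial_estimate(n_recalled_trial0, n_words):
    """Proportion recalled on trial 0 and its estimated binomial variance."""
    estimate = n_recalled_trial0 / n_words
    variance = estimate * (1.0 - estimate) / n_words
    return estimate, variance


def first_recall_estimate(n_words, n_zero_recall_trials):
    """Estimate N/N0 and its estimated asymptotic variance p^2 (1 - p)/N."""
    estimate = n_words / n_zero_recall_trials
    variance = estimate**2 * (1.0 - estimate) / n_words
    return estimate, variance


if __name__ == "__main__":
    for label, (est, var) in [
        ("trial 0 only", first_trial_estimate(9, 32)),
        ("up to first recall", first_recall_estimate(32, 120)),
    ]:
        print(label, round(est, 3), round(var, 4), round(math.sqrt(var), 3))
```

The second estimate uses all word-trials up to and including each word's first recall, which is why its standard deviation is roughly half that of the trial-0 proportion.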


We are mildly concerned about the bias of the estimate $\hat{p}_0$. We could
compute the expected value of $\hat{p}_0$ and determine whether or not it was $p_0$.
This computation is tedious, however, so we draw upon a result obtained
by Girshick, Mosteller, and Savage [6]. They demonstrated that a unique
unbiased estimate, $(\hat{p}_0)_u$, of $p_0$ when $N$ is fixed and $N_0$ is the only observed
statistic, is given by

(10.19)    $(\hat{p}_0)_u = \frac{N - 1}{N_0 - 1}.$

We do not repeat the proof here, but the reader can show that $(\hat{p}_0)_u$ is
indeed unbiased by using the distribution function $f(N_0)$ given by equation
10.9, and computing the expected value of $(\hat{p}_0)_u$. From our data we get

(10.20)    $(\hat{p}_0)_u = 31/119 = 0.261.$

By comparing this unbiased estimate (equation 10.19) with the maximum
likelihood estimate $\hat{p}_0$ (equation 10.13) we see that when $N$ and $N_0$ are
large the two estimates are nearly identical since the −1 terms are
negligible. This establishes that $\hat{p}_0$ is unbiased when $N$ gets arbitrarily large.
However, $\hat{p}_0$ is necessarily biased for finite $N$. Moreover, for large values
of $N$, since $\hat{p}_0 \approx (\hat{p}_0)_u$, the distributions of these two estimates must be
nearly identical. Therefore the variance of the unbiased estimate $(\hat{p}_0)_u$
is given approximately by equation 10.16 for large $N$.

Since the variances of all the estimates of $p_0$ given above are not small,
especially for small values of $N$, the reader may wonder why we cannot
use more of the data to estimate $p_0$. In principle this can be done,
but we know from our basic operator, $Q_1$ (equation 10.3), that as soon
as a word has been recalled once its probability of recall depends on the
parameter $\alpha_1$. We have already exhausted all the data which depend
only upon $p_0$. The data on trials beyond the trial of the first recall
depend upon both $p_0$ and $\alpha_1$, and up to this point $\alpha_1$ is completely unknown.
In using these data on later trials, we have the choice of trying to estimate
$p_0$ and $\alpha_1$ simultaneously, or of estimating $\alpha_1$ on the assumption that $p_0$ is
known. The latter procedure is used in the next section.


10.5 ESTIMATION OF $\alpha_1$

In this section we assume that the initial recall probability, $p_0$, is known; it may be estimated by the procedure given in the last section or its value may be assumed. We wish to estimate the parameter $\alpha_1$ from the data, taking $p_0 = 0.26$.

In Section 9.6 we saw that the mean total number, $\bar{T}_2$, of $A_2$ occurrences (non-recalls) may be used to estimate $\alpha_1$. Equation 9.21 gives

(10.21)  $\bar{T}_2 \approx \dfrac{-\log p_0}{1 - \alpha_1},$

and so

(10.22)  $\hat{\alpha}_1 = 1 - \dfrac{-\log p_0}{\bar{T}_2}.$

From Table 10.1 we obtain $\bar{T}_2 = 11.59$, and this value along with $p_0 = 0.26$ gives

(10.23)  $\hat{\alpha}_1 = 0.884.$
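As a check on this quick estimate, the following sketch evaluates $1 + \log p_0 / \bar{T}_2$ with the two numbers quoted in the text. It assumes that the logarithm is natural, which is the reading that reproduces 0.884; the variable names are illustrative.

```python
import math

p0 = 0.26                # assumed initial recall probability
mean_nonrecalls = 11.59  # mean total number of non-recalls, from Table 10.1

# Estimate of alpha_1 from the mean number of non-recalls
# (natural logarithm assumed).
alpha1_hat = 1 + math.log(p0) / mean_nonrecalls

print(round(alpha1_hat, 3))
```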


Thus we have obtained an estimate of $\alpha_1$ very cheaply. Nevertheless we use the more elegant maximum likelihood procedure described in Sections 9.9 and 9.10 to estimate $\alpha_1$.

Equation 9.60 is appropriate for this problem, provided that we let $\nu$ stand for the number of previous recalls of a word, $x_\nu$ denote the number of words recalled at least $\nu + 1$ times, and $N_\nu$ the number of word-trials required for the $x_\nu$ words to be recalled $\nu + 1$ times after each has been recalled $\nu$ times. For example, with three words and six trials each, we might observe the sequences shown below:

Word 1    011001
Word 2    100100
Word 3    001110

(A 1 denotes a recall and a 0 a non-recall.) Then, for $\nu = 2$, words 1 and 3 are recalled $\nu + 1 = 3$ times; hence $x_2 = 2$. Word 1 required 3 trials for the third recall after the second recall, whereas word 3 required one such trial. Thus, $N_2 = 4$.

We consider [...] in equation 9.60 to be [...] in the present model. We need the statistic $D_1$ defined by equation 9.63:

(10.24)  $D_1 = \sum_{\nu=0}^{\Omega} \nu (N_\nu - x_\nu).$


In Table 10.3 we tabulate the values of $N_\nu$ for $\nu = 0, 1, \ldots, 8$. These numbers were obtained from Table 10.1; $N_\nu$ is simply the number of times the integer $\nu$ appears in the entire table. For the above values of $\nu$, the quantity $x_\nu$ has the value 32, the number of words on the list, for we see in Table 10.1 that every word that has 8 previous recalls is recalled at least once more. (Word number 14 has the smallest number of recalls.) By stopping at $\nu = 8$ we have chosen the upper limit $\Omega$ of the sum in the last equation to be 8. We make this choice of $\Omega$ so that we can use the second procedure given in Section 9.10, that is, we have chosen $\Omega = 8$ in order that $x_\nu = x = 32$.
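The bookkeeping for $x_\nu$ and $N_\nu$ can be sketched in a few lines. The function below is an illustrative helper (its name and signature are not from the text); it is applied to the three-word example, where a 1 denotes a recall and a 0 a non-recall.

```python
def recall_counts(sequences, nu):
    """Return (x_nu, N_nu) for a list of 0/1 recall sequences.

    x_nu: number of words recalled at least nu + 1 times.
    N_nu: word-trials those words needed to go from nu to nu + 1 recalls.
    """
    x_nu = 0
    n_nu = 0
    for seq in sequences:
        recalls = 0
        waiting = 0
        for outcome in seq:
            if recalls == nu:
                waiting += 1          # a trial spent waiting for recall nu + 1
                if outcome == 1:
                    x_nu += 1
                    n_nu += waiting
                    break
            elif outcome == 1:
                recalls += 1          # still accumulating the first nu recalls
    return x_nu, n_nu


words = [[int(c) for c in s] for s in ("011001", "100100", "001110")]

for nu in range(3):
    print(nu, recall_counts(words, nu))
```

Words that never reach $\nu + 1$ recalls contribute to neither count, which matches the definition of $N_\nu$ as the trials required by the $x_\nu$ words only.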




TABLE 10.3
Tabulations used in estimating parameters.

$\nu$    $N_\nu$   $x_\nu$   $\hat{p}_\nu$   $N_\nu - x_\nu$   $\nu(N_\nu - x_\nu)$
 0       120       32        0.2605          88                  0
 1       127       32        0.2460          95                 95
 2        76       32        0.4133          44                 88
 3        62       32        0.5082          30                 90
 4        49       32        0.6458          17                 68
 5        49       32        0.6458          17                 85
 6        34       32        0.9394           2                 12
 7        57       32        0.5536          25                175
 8        38       32        0.8378           6                 48
Total ($\nu = 0$ to 8)                       324                661
 9        43       31        0.7143
10        29       27        0.9286
11        30       27        0.8966
12        28       27        0.9630
13        35       27        0.7647
14        26       25        0.9600
15        28       23        0.8148
16        21       21        1.0000
17        20       19        0.9474


From Table 10.3 we see that $D_1 = 661$. From equation 9.69 we then have

(10.25)  $G(\hat{\alpha}_1, q_0, \Omega) = D_1/x = 661/32 = 20.66.$

The function $G$ is given in Table D. For our present problem, $q_0 = 0.74$ and $\Omega = 8$, and we wish to obtain $\hat{\alpha}_1$. From Table D,

(10.26)  $G(0.86, 0.74, 8) = 19.49,$
         $G(0.88, 0.74, 8) = 23.08.$

Since we want the value of $\hat{\alpha}_1$ for which $G = 20.66$, we then interpolate to obtain

(10.27)  $\hat{\alpha}_1 = 0.867.$

This estimate is quite close to the one given by equation 10.23.




We can estimate the variance, $\sigma^2(\hat{\alpha}_1)$, of the estimate just obtained, from equation 9.82. The computation is straightforward since we know the values of $N_\nu$ from Table 10.3 and we replace $\alpha_1$ with the above estimate, 0.867, and take $q_0$ to be 0.74. The numerical computations yield $\sigma^2(\hat{\alpha}_1) = 0.00016$ ($\sigma(\hat{\alpha}_1) = 0.013$). This is probably an underestimate of the variance of $\hat{\alpha}_1$ since $q_0$ is not actually known, as was assumed in the computation of that variance.
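The interpolation in Table D that leads to the estimate 0.867 is ordinary linear interpolation between two tabulated values. A minimal sketch follows; the helper name is illustrative, and the two tabulated values of $G$ are the ones quoted in the text.

```python
import math


def interpolate_argument(x_lo, f_lo, x_hi, f_hi, target):
    """Linear interpolation: the x at which a tabulated f reaches `target`."""
    return x_lo + (target - f_lo) * (x_hi - x_lo) / (f_hi - f_lo)


# G(0.86, 0.74, 8) = 19.49 and G(0.88, 0.74, 8) = 23.08; target G = 20.66.
alpha1_hat = interpolate_argument(0.86, 19.49, 0.88, 23.08, 20.66)
print(round(alpha1_hat, 3))

# Standard deviation corresponding to the quoted variance of 0.00016.
print(round(math.sqrt(0.00016), 3))
```

The rounded target 20.66 is used, as in the text, rather than the unrounded ratio 661/32.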


10.6 SIMULTANEOUS ESTIMATION OF $q_0$ AND $\alpha_1$

In Section 10.4 we estimated the initial probability $p_0$ from that portion of the data which was independent of $\alpha_1$, and then we used this value of $p_0$ to obtain estimates of $\alpha_1$ in Section 10.5. We proceeded in this way mainly for heuristic purposes. As already suggested in Section 9.11, we can obtain simultaneous maximum likelihood estimates of $q_0$ and $\alpha_1$ by using Table D. We obtain these estimates now.

Fig. 10.2. Interpolation diagram for obtaining simultaneous maximum likelihood estimates of $q_0$ and $\alpha_1$. (Horizontal axis: $q_0$, from 0.76 to 0.78; vertical axis: $\alpha_1$, from 0.85 to 0.88.)


We need the statistic $D_1$ previously obtained and the statistic

(10.28)  $D_2 = \sum_{\nu=0}^{\Omega} (N_\nu - x_\nu).$

From Table 10.3 we see that $D_1 = 661$ and $D_2 = 324$. The estimation equations (9.77) then become

(10.29)  $G(\hat{\alpha}_1, \hat{q}_0, 8) = D_1/x = 20.66,$
         $F(\hat{\alpha}_1, \hat{q}_0, 8) = D_2/x = 10.13.$




The functions $F$ and $G$ are given in Table D, but we need to make a double interpolation to obtain the values of $\hat{\alpha}_1$ and $\hat{q}_0$. We assume that such interpolation schemes are not too well known, and so we give the details of one method here.

The procedure involves plotting $\alpha_1$ versus $q_0$ for $F = 10.13$ and again for $G = 20.66$. The two curves will intersect at the point corresponding to the desired maximum likelihood estimates $\hat{\alpha}_1$ and $\hat{q}_0$. First consider the function $F$. We see that for $q_0 = 0.76$, the value $F = 10.13$ falls between $\alpha_1 = 0.86$ and $\alpha_1 = 0.88$. We interpolate and find that $F = 10.13$ for $\alpha_1 = 0.871$. For $q_0 = 0.78$, we interpolate between $\alpha_1 = 0.84$ and $\alpha_1 = 0.86$ and find that $F = 10.13$ for $\alpha_1 = 0.854$. Then we repeat the procedure for $G$ and find that when $q_0 = 0.76$ we get $G = 20.66$ for $\alpha_1 = 0.861$, and that when $q_0 = 0.78$ we get $G = 20.66$ for $\alpha_1 = 0.855$. We then plot these values as shown in Fig. 10.2 and draw straight lines between the points. The line for $F = 10.13$ crosses the line for $G = 20.66$ at the point $(\alpha_1, q_0) = (0.856, 0.778)$ and so we have finally for our maximum likelihood estimates

(10.30)  $\hat{p}_0 = 0.222,$
         $\hat{\alpha}_1 = 0.856.$
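The graphical step, drawing two straight lines and reading off their crossing, can also be done numerically. The sketch below uses the four interpolated points quoted in the text; the helper names are illustrative and not part of the original procedure.

```python
def line_through(point_a, point_b):
    """Slope and intercept of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = point_a, point_b
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def intersection(line_a, line_b):
    """Crossing point (x, y) of two non-parallel lines given as (slope, intercept)."""
    (m1, c1), (m2, c2) = line_a, line_b
    x = (c2 - c1) / (m1 - m2)
    return x, m1 * x + c1


# Points are (q0, alpha1) pairs at which F = 10.13 and G = 20.66 respectively.
f_line = line_through((0.76, 0.871), (0.78, 0.854))
g_line = line_through((0.76, 0.861), (0.78, 0.855))

q0_hat, alpha1_hat = intersection(f_line, g_line)
p0_hat = 1 - q0_hat

print(round(q0_hat, 3), round(alpha1_hat, 3), round(p0_hat, 3))
```

With only two points per curve the lines are a local linear approximation, so the crossing point is trustworthy only because it falls inside the tabulated interval.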


These estimates can be compared to our previous estimates as summarized in Table 10.4.

TABLE 10.4

Comparison of the several estimates of the parameters $p_0$, $\alpha_1$, $\lambda_1$. The numbers in parentheses in a particular row are values assumed in obtaining the estimates shown in that row.

Procedure                              $p_0$     $\alpha_1$   $\lambda_1$
Proportion recalls on trial 0          0.281     -            -
Maximum likelihood (early trials)      0.267     -            -
Unbiased (early trials)                0.261     -            -
Mean total number of non-recalls       (0.26)    0.884        (1.0)
Maximum likelihood                     (0.26)    0.867        (1.0)
Simultaneous maximum likelihood        0.222     0.856        (1.0)
Minimum chi-square                     0.228     0.850        0.982


Having estimated the parameters $\alpha_1$ and $p_0$, we are now in a position to make some comparisons between the model and the data. We used the values $\alpha_1 = 0.86$ and $p_0 = 0.22$ and computed values of $V_{1,n}$ from equation 8.51 of Chapter 8. The computed curve, along with the experimental points, is shown in Fig. 10.1.




To make further comparisons, we ran 32 Monte Carlo computations for 32 trials each, as described in Section 6.2. The parameter values $\alpha_1 = 0.86$ and $p_0 = 0.22$ were used in making those computations. We computed several statistics from these runs, and the comparisons with those obtained from the data are shown in Table 10.5.


TABLE 10.5

Comparison of statistics of the Bruner-Miller-Zimmerman data and the Monte Carlo runs, computed with $p_0 = 0.22$, $\alpha_1 = 0.86$, $\lambda_1 = 1$.

                                              Data     Monte Carlo
Mean trial of first recall                    2.75     3.66
Mean trial of second recall                   6.72     7.19
Mean total number of non-recalls              11.59    11.22
Standard deviation of number of non-recalls   6.74     6.07
Estimate of $(p_0)_u$ from equation 10.19     0.261    0.209
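A Monte Carlo run of the kind summarized in Table 10.5 can be sketched as follows. This is a minimal illustration, not the procedure of Section 6.2: it assumes that a recall applies $Q_1 p = \alpha_1 p + (1 - \alpha_1)$ (limit point 1), that a non-recall leaves $p$ unchanged, and that words are independent. The function name, the seed, and the use of Python's `random` module are choices made here, so the numbers it produces will not reproduce the table.

```python
import random


def simulate_word(p0, alpha1, n_trials, rng):
    """Simulate one word: 1 = recall, 0 = non-recall on each trial."""
    p = p0
    outcomes = []
    for _ in range(n_trials):
        recalled = rng.random() < p
        outcomes.append(1 if recalled else 0)
        if recalled:
            # Recall operator with limit point 1.
            p = alpha1 * p + (1 - alpha1)
        # A non-recall leaves p unchanged (identity operator).
    return outcomes


rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [simulate_word(0.22, 0.86, 32, rng) for _ in range(32)]

nonrecalls = [seq.count(0) for seq in runs]
mean_nonrecalls = sum(nonrecalls) / len(nonrecalls)
print(mean_nonrecalls)
```

Statistics such as the mean trial of first recall can be computed from `runs` in the same way and compared with the data column of the table.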


10.7 IMPERFECT LEARNING 


In previous sections we have regarded the list as ultimately being recalled perfectly. It is not absolutely necessary that we so regard this list-learning task. Furthermore, for other purposes it may be valuable to have a procedure for estimating the limit point $\lambda_1$, because not all tasks will ultimately lead to 100 percent performance, even after quite extended practice.

If the list of words is not learned perfectly, it is necessary to introduce a parameter $\lambda_1$ in the recall operator $Q_1$ to correspond to the upper limit of the recall proportion. The operators under such imperfect recall are

(10.31)  $Q_1 p = \alpha_1 p + (1 - \alpha_1)\lambda_1,$
         $Q_2 p = p,$

where $Q_1$ is associated with recall and $Q_2$ with non-recall. When a word has been recalled exactly $\nu$ times, its probability of recall has reached

(10.32)  $p_\nu = Q_1^{\nu} p_0 = \alpha_1^{\nu} p_0 + (1 - \alpha_1^{\nu})\lambda_1.$


As before, we let $x_\nu$ be the number of words recalled at least $\nu + 1$ times, and let $N_\nu$ be the number of word-trials required for those $x_\nu$ words to be recalled $\nu + 1$ times after each has been recalled $\nu$ times. (For small values of $\nu$, $x_\nu$ will equal the total number of words, but some words are never recalled more than, say, eight times.) Analogous to equation 10.19, an unbiased estimate of $p_\nu$ is given by

(10.33)  $\hat{p}_\nu = \dfrac{x_\nu - 1}{N_\nu - 1},$

and

(10.34)  $\sigma^2(\hat{p}_\nu) \approx \dfrac{p_\nu^2 (1 - p_\nu)}{x_\nu}.$


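Knowledge of these quantities suggests the possibility of minimizing a $\chi^2$-like quantity

(10.35)  $\chi^2 = \sum_{\nu=0}^{\Omega} W_\nu \left[\alpha_1^{\nu} p_0 + (1 - \alpha_1^{\nu})\lambda_1 - \hat{p}_\nu\right]^2,$

where

(10.36)  $W_\nu = 1/\sigma^2(\hat{p}_\nu),$

and $\Omega$ is the largest number of recalls retained for estimation purposes (for these data we use $\Omega = 25$). The $W$'s involve the parameters being estimated, but it is worth noticing that the $W$'s may be regarded as weights. Small changes in weights usually make relatively little difference in such problems, so we can obtain an approximation for the $W$'s and temporarily pretend they are constants. To obtain the minimizing equations we differentiate $\chi^2$ partially with respect to $p_0$, $\lambda_1$, and $\alpha_1$, and set the results equal to zero to obtain from $p_0$, $\lambda_1$, and $\alpha_1$, respectively,

(10.37)  $\sum_\nu W_\nu \alpha_1^{\nu} \left[\alpha_1^{\nu} p_0 + (1 - \alpha_1^{\nu})\lambda_1 - \hat{p}_\nu\right] = 0,$

(10.38)  $\sum_\nu W_\nu (1 - \alpha_1^{\nu}) \left[\alpha_1^{\nu} p_0 + (1 - \alpha_1^{\nu})\lambda_1 - \hat{p}_\nu\right] = 0,$

(10.39)  $\xi(\alpha_1) = \sum_\nu W_\nu\, \nu \alpha_1^{\nu-1} \left[\alpha_1^{\nu} p_0 + (1 - \alpha_1^{\nu})\lambda_1 - \hat{p}_\nu\right] = 0,$

where $\xi(\alpha_1)$ is introduced for convenience of notation. Using the first of these equations in the second, and rewriting, we get

(10.40)  $p_0 \sum_\nu W_\nu \alpha_1^{2\nu} + \lambda_1 \sum_\nu W_\nu \alpha_1^{\nu}(1 - \alpha_1^{\nu}) = \sum_\nu W_\nu \alpha_1^{\nu} \hat{p}_\nu,$

(10.41)  $p_0 \sum_\nu W_\nu \alpha_1^{\nu} + \lambda_1 \sum_\nu W_\nu (1 - \alpha_1^{\nu}) = \sum_\nu W_\nu \hat{p}_\nu,$

(10.42)  $\xi(\alpha_1) = p_0 \sum_\nu W_\nu\, \nu \alpha_1^{2\nu-1} + \lambda_1 \sum_\nu W_\nu\, \nu \alpha_1^{\nu-1}(1 - \alpha_1^{\nu}) - \sum_\nu W_\nu\, \nu \alpha_1^{\nu-1} \hat{p}_\nu = 0.$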




We have already described how to obtain the $\hat{p}_\nu$. The $W_\nu$ can be estimated by choosing a fairly close fitting set of parameters and computing successive values of the variance for that set of parameters. A freehand fit to the curve of $\hat{p}_\nu$ against $\nu$ could be used to approximate the $p_\nu$, but it would seem better to use a functional form that is of roughly the right shape. We chose $p_0 = 0.26$, $\alpha_1 = 0.85$, $\lambda_1 = 0.98$ for this purpose. As $\nu$ increased, the factor $(1 - p_\nu)$ decreased, but the number of words available also decreased, leaving the weights roughly equal for all $\nu$ (except $\nu = 0$, where the weight was twice the average). So, as a first approximation, it turns out that the $W$'s can be dropped altogether.

For any given value of $\alpha_1$, equations 10.40 and 10.41 are a pair of linear simultaneous equations that can be solved for $p_0$ and $\lambda_1$. When numerical values for all three parameters are then substituted into the third equation, we get a value of $\xi(\alpha_1)$, usually not zero. For example, for the word data we find:

$\alpha_1$    $\xi(\alpha_1)$    $p_0$       $\lambda_1$
0.83          +0.0734            0.197198    0.966951
0.84          -0.0262            0.210704    0.976207
0.85          -0.1342            0.224892    0.986632

Interpolating on $\alpha_1$ to make $\xi(\alpha_1)$ vanish, we find as estimates $\hat{p}_0 \approx 0.207$, $\hat{\alpha}_1 \approx 0.837$, $\hat{\lambda}_1 \approx 0.974$. This value of $\hat{\lambda}_1$ is quite close to unity, as would be expected.
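The step of fixing $\alpha_1$, solving two linear equations for $p_0$ and $\lambda_1$, and then evaluating the third equation can be sketched as below. This is an illustration under stated assumptions: the function name is made up, the weights default to equal values, and the check uses synthetic proportions generated from known parameters rather than the word data, so it demonstrates only that the algebra is self-consistent.

```python
def fit_given_alpha(alpha, p_hat, weights=None):
    """For a fixed alpha_1, solve the two linear normal equations for
    (p0, lambda1) and return (p0, lambda1, xi), where xi is the left-hand
    side of the third normal equation (zero at the minimum)."""
    n = len(p_hat)
    w = weights if weights is not None else [1.0] * n
    a = [alpha ** v for v in range(n)]

    # Coefficients of the 2 x 2 linear system in p0 and lambda1.
    s_aa = sum(w[v] * a[v] * a[v] for v in range(n))
    s_ab = sum(w[v] * a[v] * (1 - a[v]) for v in range(n))
    s_a = sum(w[v] * a[v] for v in range(n))
    s_b = sum(w[v] * (1 - a[v]) for v in range(n))
    r_a = sum(w[v] * a[v] * p_hat[v] for v in range(n))
    r = sum(w[v] * p_hat[v] for v in range(n))

    # Cramer's rule.
    det = s_aa * s_b - s_ab * s_a
    p0 = (r_a * s_b - s_ab * r) / det
    lam = (s_aa * r - s_a * r_a) / det

    # Third equation; the nu = 0 term vanishes and is skipped.
    xi = sum(
        w[v] * v * alpha ** (v - 1) * (a[v] * p0 + (1 - a[v]) * lam - p_hat[v])
        for v in range(1, n)
    )
    return p0, lam, xi


# Synthetic check: proportions generated exactly from known parameters.
true_p0, true_alpha, true_lam = 0.2, 0.85, 0.98
synthetic = [true_alpha ** v * true_p0 + (1 - true_alpha ** v) * true_lam
             for v in range(26)]

print(fit_given_alpha(true_alpha, synthetic))
print(fit_given_alpha(0.80, synthetic)[2])  # nonzero away from the true alpha
```

In practice one would evaluate `fit_given_alpha` on a short grid of $\alpha_1$ values and interpolate to the value at which the third component changes sign, as in the small table in the text.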

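In obtaining these estimates we used the data of Table 10.3 for the first 25 recalls. By using so many recalls we are ultimately reduced to as few as 10 words for estimation purposes, and it seems wise to stop there. Of the 1024 observations, 997 were actually used (the cut-off at $\Omega = 25$, and the estimation procedure forced the discard of 27).

If the $W$'s are taken to be equal, the summations not involving $\hat{p}$'s are readily obtained:

(10.43)  $\Sigma_1 = \sum_{\nu=0}^{\Omega} \alpha_1^{\nu} = \dfrac{1 - \alpha_1^{\Omega+1}}{1 - \alpha_1},$

(10.44)  $\Sigma_2 = \sum_{\nu=0}^{\Omega} \alpha_1^{2\nu} = \dfrac{1 - \alpha_1^{2\Omega+2}}{1 - \alpha_1^{2}},$

(10.45)  $\Sigma_3 = \sum_{\nu=0}^{\Omega} \nu \alpha_1^{\nu-1} = \dfrac{d\Sigma_1}{d\alpha_1} = \dfrac{(1 - \alpha_1^{\Omega+1}) - (1 - \alpha_1)(\Omega + 1)\alpha_1^{\Omega}}{(1 - \alpha_1)^2},$

(10.46)  $\Sigma_4 = \sum_{\nu=0}^{\Omega} \nu \alpha_1^{2\nu-1} = \dfrac{\alpha_1 (1 - \alpha_1^{2\Omega+2}) - (1 - \alpha_1^{2})(\Omega + 1)\alpha_1^{2\Omega+1}}{(1 - \alpha_1^{2})^2},$

(10.47)  $\Sigma_5 = \sum_{\nu=0}^{\Omega} (1 - \alpha_1^{\nu}) = \Omega + 1 - \Sigma_1,$

(10.48)  $\Sigma_6 = \sum_{\nu=0}^{\Omega} \alpha_1^{\nu}(1 - \alpha_1^{\nu}) = \Sigma_1 - \Sigma_2,$

(10.49)  $\Sigma_7 = \sum_{\nu=0}^{\Omega} \nu \alpha_1^{\nu-1}(1 - \alpha_1^{\nu}) = \Sigma_3 - \Sigma_4.$

Furthermore, when equal weights are used, the only tables that must be constructed are those for $\alpha_1^{\nu}$ and $\nu\alpha_1^{\nu-1}$, for each numerical value of $\alpha_1$ used, and $\nu = 0, 1, \ldots, \Omega$. But when the $W$'s are taken as unequal, it is necessary to construct tables of $\alpha_1^{\nu}$, $\alpha_1^{2\nu}$, $\nu\alpha_1^{\nu-1}$, and $\nu\alpha_1^{2\nu-1}$ for $\nu = 0, 1, \ldots, \Omega$, to facilitate computation of the summations involving the $W$'s.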
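Closed forms for the equal-weight summations are easy to get wrong by one power, so a numerical check against direct summation is worthwhile. The sketch below compares the two for the four basic sums (of $\alpha^{\nu}$, $\alpha^{2\nu}$, $\nu\alpha^{\nu-1}$, and $\nu\alpha^{2\nu-1}$). The closed forms are standard finite geometric-series results written out here, and may differ in layout from the printed equations; the function names are illustrative.

```python
def sums_direct(alpha, omega):
    """The four basic sums over nu = 0..omega, computed term by term."""
    nus = range(omega + 1)
    s1 = sum(alpha ** v for v in nus)
    s2 = sum(alpha ** (2 * v) for v in nus)
    s3 = sum(v * alpha ** (v - 1) for v in nus if v > 0)
    s4 = sum(v * alpha ** (2 * v - 1) for v in nus if v > 0)
    return s1, s2, s3, s4


def sums_closed(alpha, omega):
    """The same four sums from finite geometric-series closed forms."""
    s1 = (1 - alpha ** (omega + 1)) / (1 - alpha)
    s2 = (1 - alpha ** (2 * omega + 2)) / (1 - alpha ** 2)
    s3 = ((1 - alpha ** (omega + 1))
          - (1 - alpha) * (omega + 1) * alpha ** omega) / (1 - alpha) ** 2
    s4 = (alpha * (1 - alpha ** (2 * omega + 2))
          - (1 - alpha ** 2) * (omega + 1) * alpha ** (2 * omega + 1)) \
        / (1 - alpha ** 2) ** 2
    return s1, s2, s3, s4


direct = sums_direct(0.85, 25)
closed = sums_closed(0.85, 25)
for d, c in zip(direct, closed):
    print(round(d, 6), round(c, 6))
```

The remaining three sums follow by subtraction: $\Omega + 1$ minus the first, the first minus the second, and the third minus the fourth.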

The first set of results obtained with equal $W$'s was used to generate a new set of approximate $p_\nu$'s, $\nu = 0, 1, \ldots, 25$. From these $p$'s new weights were obtained. The minimizing equations were solved using these unequal weights. The results were $\hat{p}_0 = 0.231$, $\hat{\alpha}_1 = 0.850$, $\hat{\lambda}_1 = 0.981$. Using these results a second set of unequal weights was computed, and this last iteration step gave finally

(10.50)  $\hat{p}_0 = 0.228,$
         $\hat{\alpha}_1 = 0.850,$
         $\hat{\lambda}_1 = 0.982.$

These results are so close to the penultimate set that the iterative process was abandoned at this point. Though it was worth keeping a third decimal for the purposes of iteration and computation, we know that the unreliability of the estimates is such that rounding to two decimals does them more than justice. It is worth noticing that the results for $\hat{p}_0$ and $\hat{\alpha}_1$ are extremely close to those obtained by the maximum likelihood procedure where $\lambda_1$ was assumed to be unity. This is no surprise now that we know $\hat{\lambda}_1$ is so close to unity because minimum $\chi^2$ is essentially maximum likelihood. However, we did use much more data in the minimum $\chi^2$ approach, but, of course, the extra data from the later part of the experiment were made up for in the maximum likelihood procedure by the direct assumption of unit $\lambda_1$.

When the parameter values $\hat{p}_0 = 0.231$, $\hat{\alpha}_1 = 0.850$, $\hat{\lambda}_1 = 0.981$ were used to compute $\chi^2$ in equation 10.35, the value turned out to be 56.3. This is a rather large value for 23 degrees of freedom (about four standard normal deviates out). In Fig. 10.3 values of $\hat{p}_\nu$ are plotted against $\nu$. In this figure sources of the large value of $\chi^2$ are apparent; for $\nu = 6$ and 7, the observed proportions differ from the computed ones by about 0.24 and 0.19, respectively. The weights associated with these points in the $\chi^2$ computation are about 220. Thus these two points are contributing about 20.6 to the 56.3. There certainly is, therefore, more oscillation about the mean curve in the data than can be accounted for by the built-in variation in the model. If a subject learns words in clusters, rather than independently as assumed in this model, then one word in a cluster could

Fig. 10.3. Values of the estimates, $\hat{p}_\nu$ (circles joined by straight lines), of the recall probability after $\nu$ recalls, obtained from equation 10.33, and the values of $p_\nu$ (smooth curve) computed from equation 10.32, with $p_0 = 0.231$, $\alpha_1 = 0.850$, and $\lambda_1 = 0.981$.


though we have not done so, it mig 
would include such cluster-learning. 
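The contribution of the points $\nu = 6$ and $\nu = 7$ to the $\chi^2$ value can be checked roughly with the sketch below. It assumes the weight is the reciprocal of a variance of the form $p_\nu^2(1 - p_\nu)/x_\nu$ with $x_\nu = 32$, and it takes the observed proportions 0.9394 and 0.5536 from Table 10.3; the function names are illustrative. Because the inputs are rounded, the result should land near, not exactly on, the figures quoted in the text.

```python
p0, alpha1, lam1 = 0.231, 0.850, 0.981
x_words = 32                        # words still available at nu = 6 and 7
observed = {6: 0.9394, 7: 0.5536}   # estimated proportions from Table 10.3


def model_p(nu):
    """Model recall probability after nu recalls."""
    return alpha1 ** nu * p0 + (1 - alpha1 ** nu) * lam1


def weight(nu):
    """Assumed weight: reciprocal of p^2 (1 - p) / x."""
    p = model_p(nu)
    return x_words / (p ** 2 * (1 - p))


contributions = {
    nu: weight(nu) * (model_p(nu) - obs) ** 2 for nu, obs in observed.items()
}
total = sum(contributions.values())

for nu in observed:
    print(nu, round(model_p(nu), 3), round(weight(nu)), round(contributions[nu], 1))
print(round(total, 1))
```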


10.8 EVALUATION OF THE MODEL

Now that we have presented our first detailed application of the mathematical system described in Part I, we are in a position to evaluate its usefulness in more concrete terms than we have done heretofore. We delay a general evaluation until Chapter 15, but we now try to evaluate the model given in the present chapter. What has been accomplished with the model that could not have been done by routine analysis of the data? In attempting to answer this question we make three main points: (1) The model gives a more detailed description of the data than is ordinarily done, (2) the model yields a concise summary of the data which simplifies parametric studies, and (3) the model provides a baseline for studying more subtle effects, or differences in stimuli or subjects.


1. THE MODEL GIVES A DETAILED DESCRIPTION OF THE DATA. Most learning theories do not provide a framework for analyzing data at the level of single subjects and single trials. The model given in this chapter does give a probabilistic description of the data in terms of the probability of recall of a single word on each trial. By such a detailed analysis we can compare the relative effects of recall and non-recall of a word on the subsequent recall probabilities. We found that our assumption that non-recall had no effect was justified, that is, the data gave no evidence to reject this hypothesis. We also found that recall of a word reduced the probability of non-recall to about 85 percent of its previous value.


2. THE MODEL LEADS TO A CONCISE SUMMARY OF THE DATA. The data are completely specified by the model and the values of three parameters, $p_0$, $\alpha_1$, and $\lambda_1$. From these parameters the model can generate a large number of statistics of the data. For example, the mean and variance of the number of recalls in a given number of trials, the mean and variance of the trial of the first recall, the mean and variance of the trial of the second recall, etc., can be computed from the model once we have estimated the three parameters. In this sense, the parameters $p_0$, $\alpha_1$, and $\lambda_1$ are "basic" quantities. Furthermore, the psychological meaning of these parameters is not obscure. The parameter values may be considered summary statistics or properties of a particular set of data and may be used in making comparisons between various sets of data. Individual subjects can be compared easily, and the effects of changing the number of words or the speed of presentation of the words can be measured readily in terms of the parameters.

3. THE MODEL PROVIDES A BASELINE FOR STUDYING EFFECTS OUTSIDE THE MODEL. Several simplifying assumptions were made in constructing the model, and the validity of such assumptions depends on various experimental conditions. For example, we assumed that the words were equivalent for the subject, that is, equally difficult to learn. The comparisons between the data and the Monte Carlo runs, shown in Table 10.5, tend to support that assumption, but this is undoubtedly a result of the care taken by the experimenters to equate the words. But the model provides a way of measuring how well they succeeded! From Table 10.5 we note that the standard deviation of the number of non-recalls was 6.74 for the Bruner-Miller-Zimmerman data and 6.07 for the Monte Carlo runs. Had the experimenters exercised less care and skill in equating the words, we would expect that the variance in the number of non-recalls would have been considerably larger. The experimenters scrambled the order of the words on each trial to eliminate serial effects, and our analysis indicates that they were successful. Experiments in which serial effects are present could be analyzed in the same way. In such cases, of course, the model would not be expected to reproduce the data accurately, but it could be used as a baseline for measuring the magnitude of the serial effects. Another possibility is that an experimenter might want to include some "loaded" or "emotionally toned" words in his list and find out if the subject had greater or less difficulty with such words. Again, the model provides a baseline for measuring the magnitude of such effects.


In the chapters which follow we have further opportunities for making evaluations such as those just given. The main points are the same, but the context is different in each case.


10.9 SUMMARY 


An experiment on verbal learning with serial effects minimized is described, and the data are analyzed. It is assumed, following Miller and McGill [4], that non-recall of a word does not change the recall probability $p$. This assumption is supported by the data. Hence when non-recall occurs the operation applied is

$Q_2 p = p.$

When recall occurs, the operation is

$Q_1 p = \alpha_1 p + (1 - \alpha_1)\lambda_1.$

Thus, the analysis given in Section 8.5 is applied. Both the cases of the limit point $\lambda_1 = 1$ and $\lambda_1 < 1$ are considered. Several procedures for estimating parameters are described, and goodness-of-fit is measured.


REFERENCES

1. Hovland, C. I. Human learning and retention. Handbook of experimental psychology, S. S. Stevens, ed., New York: Wiley, 1951, pp. 618-624.

2. Hull, C. L., Hovland, C. I., Ross, R. T., Hall, M., Perkins, D. T., and Fitch, F. B. Mathematico-deductive theory of rote learning, New Haven: Yale University Press, 1940.

3. Bruner, J. S., Miller, G. A., and Zimmerman, C. Discriminative skill and discriminative matching in perceptual recognition, J. exp. Psychol., 1955, 49, in press.

4. Miller, G. A., and McGill, W. J. A statistical description of verbal learning. Psychometrika, 1952, 17, 369-396.

5. Mood, A. M. An introduction to the theory of statistics, New York: McGraw-Hill, 1950, p. 61.

6. Girshick, M. A., Mosteller, F., and Savage, L. J. Unbiased estimates for certain binomial sampling problems with applications. Annals of math. Stat., 1946, 17, 13-23.


CHAPTER 11


Avoidance Training 


11.1 INTRODUCTION 


Since the original experiments of Bekhterev on the conditioned withdrawal response, reported in the early part of the century, a number of studies of avoidance conditioning have been made [1, 2]. In these experiments the response which is learned prevents the appearance of a noxious stimulus such as an electric shock. Responses such as withdrawal of hand or foot, running, and jumping have been conditioned in this way. In order to establish these responses, it was necessary of course to shock the animal on the first few trials and to present a warning (conditioned stimulus such as a buzzer or tone) on each trial.

The results of avoidance training experiments have been interpreted as a combination of "classical" and "instrumental" conditioning [3]. The electric shock is regarded as an unconditioned stimulus (the withdrawal response is a reflex) and the warning stimulus is considered to be a conditioned stimulus which acquires the ability to evoke the response through classical conditioning. The fact that the response causes a cessation of the noxious stimulus leads to the interpretation that instrumental conditioning is also involved. Escape from shock reinforces the withdrawal response during the early trials, whereas escape from the conditioned noxious stimulus reinforces that behavior on later trials.

Another interpretation of avoidance training has been given by Miller [4]. Fear is considered to be a learnable drive, and fear reduction is assumed to be a reward. The early trials of avoidance training presumably establish fear of the conditioned stimulus as a drive, whereas both escape and avoidance result in reinforcement of the behavior involved, escape because it leads to pain reduction and avoidance because it leads to fear or anxiety reduction.

We shall attempt to describe the results of an avoidance experiment in terms of a simple model, which is an application of our general model. On each trial the animal either avoids or does not avoid shock. Avoiding will be considered response $A_1$, and not-avoiding response $A_2$. Not-avoiding is necessarily followed by shock and ordinarily by escape from that shock. The occurrence of $A_1$ will have a specified effect on the probability $p$ of response $A_1$ on a trial; our operator $Q_1$ describes this effect. Also, the occurrence of an $A_2$ will have another effect on $p$, and this effect is described by $Q_2$. Hence we consider this type of experiment as an example of two subject-controlled events. Before spelling out this application in detail, we describe an experiment on avoidance training. Later we estimate the parameters in the model from the data obtained from that experiment.


11.2 THE SOLOMON-WYNNE EXPERIMENT 

In a study of "traumatic avoidance learning" Solomon and Wynne describe an experiment in which dogs learn to jump a barrier to avoid an intense electric shock [3]. The subjects were 30 "mongrel dogs of medium size" weighing 9 to 13 kilograms. The apparatus was a variation of the Miller-Mowrer shuttle box used for avoidance training of rats. The box consisted of two compartments separated by a barrier and a "guillotine-type gate," which could be raised or lowered. The barrier was adjusted to the height of each dog's back. The floor of the apparatus consisted of steel bars which were wired to the shock circuit.

The conditioned stimulus (CS) consisted of turning out the lights above the compartment the dog was in and simultaneously raising the gate. The other compartment was still illuminated. In ten pretest trials none of the 30 dogs jumped the barrier during a 2-minute exposure of the CS. During training the CS was presented for 10 seconds and was then followed by an intense electric shock applied through the floor to the dogs' feet. The voltage was the "highest possible without producing tetany of the dogs' leg muscles." The current was about 100 to 125 milliamperes for most dogs. The shock was left on until the dog escaped over the barrier into the illuminated compartment, where no shock was administered.* The gate was closed as soon as the dog jumped. If a dog jumped before the shock was turned on (during the 10-second period), the trial was recorded as an avoidance trial, whereas if the dog did not jump until it was shocked the trial was recorded as an escape or shock trial. The experiment was designed so that shock could be escaped or avoided only by jumping the barrier. Reaction latencies and various qualitative aspects of behavior were also recorded, but no use of these data will be made in the following analysis.

The experimental record of the 30 dogs is shown in Table 11.1.† Trials on which shock occurred are indicated by an S; all other trials are avoidance trials. (Dog number 66 actually jumped within 10 seconds on the first experimental trial. However, this jump was not counted as an avoidance but was considered an eleventh pretest trial. The next trial for this dog is labeled trial 0 in Table 11.1.)

TABLE 11.1

Record of shocks for the first 25 trials on each of 30 dogs in the Solomon-Wynne experiment. Occurrence of shock is indicated by S, non-occurrence by a blank. (The number assignments to the dogs were made by Solomon and Wynne and do not imply that a selection has been made.)

[Table body (dogs by trial number $n$) is not recoverable from this copy.]

* If a dog did not escape after 120 seconds of shock the trial was terminated and recorded as a shock trial with "infinite" latency. Out of 234 shock trials, only 8 were of this type. Rather than introduce a third operator and possibly a third response, we assume that these few trials had the same effect as trials on which escape actually occurred.

† We are grateful to R. L. Solomon for making the data available and for numerous helpful suggestions.




11.3 THE MODEL 


As already pointed out, the Solomon-Wynne experiment is taken as an example of two subject-controlled events. We identify avoidance of shock on an experimental trial as response $A_1$ (or event $E_1$) and non-avoidance (escape) as response $A_2$ (or event $E_2$). The probability of avoidance on trial $n$ is denoted by $p_n$. When avoidance occurs on trial $n$ we apply operator $Q_1$ to $p_n$ to obtain $p_{n+1}$, and when non-avoidance occurs we apply $Q_2$ to obtain $p_{n+1}$. The probability that the dog is shocked on trial $n$ is $q_n = 1 - p_n$. It is assumed that a dog jumps over the barrier on every trial, that is, the dog either avoids or escapes shock on every trial.

The operators $Q_1$ and $Q_2$ are, of course, of the form

(11.1)  $Q_i p = \alpha_i p + (1 - \alpha_i)\lambda_i,$

but rather than maintain this general form we place some restrictions on these operators. First of all, we see from the data in Table 11.1 that all dogs ultimately learn to avoid. Data for trials beyond $n = 24$ obtained by Solomon and Wynne, but not shown in Table 11.1, show that every dog in the present experiment avoided without fail on trials beyond the twenty-fifth. (Some dogs had as many as 200 trials.) Therefore, the operator $Q_1$, which is applied to $p$ when a dog avoids, must maintain $p$ very near unity. For this reason we take $\lambda_1 = 1$.

We also take the limit point $\lambda_2$ equal to unity. Clearly, the administering of shock to a dog increases the probability of avoidance, at least during the early part of the training. Furthermore, if $\lambda_2$ were appreciably less than unity, and if a shock occurred late in training, the probability of avoidance would decrease, making shock more likely on the next trial. The data suggest that this does not occur, and so we take $\lambda_2 = 1$. These restrictions give us the operators defined by

(11.2)  $Q_1 p = \alpha_1 p + (1 - \alpha_1)$  (avoidance),
        $Q_2 p = \alpha_2 p + (1 - \alpha_2)$  (shock).

The probability of avoiding is $p$, of shock, $q = 1 - p$. These operators are of the type discussed in Section 8.2: they have equal limit points and therefore commute with one another. We make use of the analysis given in Sections 8.2 and 8.3.

It is convenient to use the complementary operators $\tilde{Q}_i$ which are applied to $q$, the probability of shock, and are defined by $\tilde{Q}_i q = 1 - Q_i p$. From equation 11.2 we get

(11.3)  $\tilde{Q}_1 q = \alpha_1 q$  (avoidance),
        $\tilde{Q}_2 q = \alpha_2 q$  (shock).


SEC. 11.4 ESTIMATION OF THE SHOCK PARAMETER, X5 241 


As we saw in Chapter 8, the probability of response $A_2$ (shock) on trial $n$, when $k$ previous $A_1$'s (avoidances) occurred, is

(11.4)   $q_{n,k} = \tilde{Q}_1^{k} \tilde{Q}_2^{n-k} q_0 = \alpha_1^{k} \alpha_2^{n-k} q_0$,

where $q_0$ is the initial probability of shock.

We place one further restriction on the model: we assume that the initial probability of avoidance, $p_0$, is zero. We see from Table 11.1 that none of the 30 dogs avoided on trial 0. Moreover, as mentioned in Section 11.2, ten pretest trials were given each dog during which the CS was exposed for 2 minutes; none of the dogs jumped the barrier during these pretest trials. (We have already noted that dog 66 jumped on a trial considered to be an eleventh pretest trial. For analytic purposes, one jump in 301 trials cannot seriously change results based upon the assumption that $p_0 = 0$.) Since $p_0 = 0$ and $q_0 = 1$, equation 11.4 becomes

(11.5)   $q_{n,k} = \alpha_1^{k} \alpha_2^{n-k}$.

Therefore, the model for this experiment contains but two parameters, $\alpha_1$ and $\alpha_2$. The next two sections discuss the procedures used in estimating these two parameters from the data. We note that $\alpha_1$ is a measure of the effectiveness of an avoidance trial in increasing the probability of avoidance; if $\alpha_1$ were unity, the trial would have no effect, and if $\alpha_1$ were zero the probability of avoidance would go to unity at once. Similarly, $\alpha_2$ is a measure of the effectiveness of a shock trial in teaching the dog to avoid. It is convenient to call $\alpha_1$ the avoidance parameter and $\alpha_2$ the shock parameter. We have more to say about these interpretations in Section 11.7.
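As a quick illustration of equation 11.5, the following Python sketch evaluates the shock probability for a dog with a given history. The function and argument names are ours and do not come from the text; the values 0.80 and 0.92 in the example are the estimates reached later in this chapter and are used here only for illustration.

```python
def shock_probability(n, k, alpha1, alpha2, q0=1.0):
    """Shock probability on trial n after k avoidances and n - k shocks.

    Implements q_{n,k} = alpha1**k * alpha2**(n - k) * q0 (equation 11.4);
    with q0 = 1 this reduces to equation 11.5.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return (alpha1 ** k) * (alpha2 ** (n - k)) * q0


if __name__ == "__main__":
    # Illustrative parameter values only (alpha1 = 0.80, alpha2 = 0.92).
    for n, k in [(1, 0), (2, 0), (3, 1)]:
        print(n, k, round(shock_probability(n, k, 0.80, 0.92), 4))
```

Because the operators commute, only the counts of avoidances and shocks enter the function, not their order.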


11.4 ESTIMATION OF THE SHOCK PARAMETER, $\alpha_2$

We first discuss the problem of estimating the shock parameter $\alpha_2$ from the data given in Table 11.1. Again, we consider more than one method of estimating parameters. For $\alpha_2$ we use these methods:

1. Information based on the results of trial 1; same for trial 2.

2. The mean number of trials before the first avoidance.

3. An approximate maximum likelihood estimate based on trials before the first avoidance to assist in obtaining estimate 4.

4. The preferred estimate, a maximum likelihood estimate based on the same trials as method 3.

Method 1 is an obvious method for introductory purposes. Method 2 was used in the Introduction to this book; the estimate is easy to obtain from tables we provide. Estimates from 4 are more trouble to get, but we think they have more precision than the others.

We use the symbol $\hat{\alpha}_2$ indiscriminately for all these estimates. The interpretation of the "hat" on $\alpha_2$ then is to mean "an estimate of" rather than "the estimate of." If we were to try to distinguish each of the estimates by a different symbol, we feel that the reader would be little further ahead, and perhaps rather confused. The source of each estimate should be clear from the context.


METHOD 1. By assumption, the probability of shock, $q_0$, on trial 0 is unity, and so on trial 1 all dogs have the same probability of shock. From equation 11.5 we learn that this latter probability is

(11.6)   $q_{1,0} = \alpha_2$.

We may estimate $q_{1,0}$ at once by the proportion of the 30 dogs that receive shock on trial 1. In Table 11.1 we see that 27 dogs are shocked on trial 1, and so as our first estimate of $\alpha_2$ we obtain

(11.7)   $\hat{\alpha}_2 = 27/30 = 0.900$.

Furthermore, we see that these 27 dogs have a probability of shock on trial 2 given by

(11.8)   $q_{2,0} = \alpha_2^{2}$.

We can estimate $q_{2,0}$ by the proportion of the 27 dogs that receive shock on trial 2. From the data we get $\hat{q}_{2,0} = 24/27$, and so as a second estimate of $\alpha_2$ we get

(11.9)   $\hat{\alpha}_2 = \sqrt{24/27} = 0.943$.

This estimate is not too close to the estimate given by equation 11.7 and so we would like to obtain further estimates and average them. We could, with no difficulty, obtain further estimates of $\alpha_2$ from $q_{3,0}$, $q_{4,0}$, etc., and then combine them to obtain a grand estimate of $\alpha_2$, but, instead, we consider two other procedures.
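The two Method 1 estimates can be reproduced in a few lines. This is only a sketch: the helper name `trial_estimate` is ours, and it relies on the relation $q_{n,0} = \alpha_2^{n}$ of equation 11.5, so that the $n$th root of the observed shock proportion among dogs with no previous avoidance estimates $\alpha_2$.

```python
def trial_estimate(shocked, at_risk, trial):
    """Estimate alpha_2 from one trial, using q_{n,0} = alpha_2 ** n.

    shocked : dogs shocked on this trial among those with no avoidance yet
    at_risk : dogs with no previous avoidance at the start of the trial
    trial   : trial number n (n >= 1)
    """
    if trial < 1 or at_risk <= 0:
        raise ValueError("need trial >= 1 and at_risk > 0")
    return (shocked / at_risk) ** (1.0 / trial)


estimate_trial_1 = trial_estimate(27, 30, 1)   # equation 11.7
estimate_trial_2 = trial_estimate(24, 27, 2)   # equation 11.9

if __name__ == "__main__":
    print(round(estimate_trial_1, 3), round(estimate_trial_2, 3))
```

The same helper could be applied to later trials, but the number of dogs still without an avoidance shrinks quickly, so those estimates rest on few observations.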


METHOD 2. In Section 9.6 we described the statistic $\bar{F}_1$, defined as the mean number of trials before the first $A_1$ occurrence (avoidance). Equation 9.3 tells us that when $\lambda_2 = 1$ and $q_0 = 1$, as is the case in the present model,

(11.10)   $\bar{F}_1 = \Phi(\alpha_2, 1)$,

where $\Phi(\alpha, \beta)$ is the function given in Table A. The number of trials before the first avoidance of each dog is obtained directly from the data in Table 11.1. The mean is 4.50, and the standard deviation of the 30 observations is 2.25. From Table A we see that $\Phi(\alpha_2, 1)$ is 4.39 for $\alpha_2 = 0.92$ and 5.08 for $\alpha_2 = 0.94$. Therefore, by interpolation we get as an estimate of $\alpha_2$,

(11.11)   $\hat{\alpha}_2 = 0.923$.
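For readers without Table A, the Method 2 calculation can be sketched directly from the model. The series below are our own derivation, not quoted from the text: with $q_{n,0} = \alpha_2^{n}$, the chance that the first $n$ trials are all shocks is $\alpha_2^{n(n-1)/2}$, so the mean number of trials before the first avoidance is the sum of these terms. The function names `phi`, `psi`, `sd_trials`, and `solve_alpha2` are hypothetical. The sums reproduce the tabled values 4.39 and 5.08, and the second series, combined as in equation 11.12, gives the model's standard deviation.

```python
import math


def phi(alpha, tol=1e-15):
    """Mean trials before the first avoidance: sum over n >= 1 of alpha**(n(n-1)/2)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    total, n = 0.0, 1
    while True:
        term = alpha ** (n * (n - 1) // 2)   # P(shock on trials 0 .. n-1)
        if term < tol:
            return total
        total += term
        n += 1


def psi(alpha, tol=1e-15):
    """Companion series: sum over v >= 1 of v * alpha**(v(v+1)/2)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    total, v = 0.0, 1
    while True:
        term = v * alpha ** (v * (v + 1) // 2)
        if term < tol:
            return total
        total += term
        v += 1


def sd_trials(alpha):
    """Standard deviation implied by sigma^2 = 2*Psi + Phi - Phi^2."""
    p = phi(alpha)
    return math.sqrt(2.0 * psi(alpha) + p - p * p)


def solve_alpha2(observed_mean, lo=0.5, hi=0.99, iterations=60):
    """Bisection for the alpha whose model mean equals the observed mean."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if phi(mid) < observed_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(round(phi(0.92), 2), round(phi(0.94), 2))
    print(round(solve_alpha2(4.50), 3), round(sd_trials(0.92), 2))
```

Bisection replaces the linear interpolation in Table A, so the root may differ from 0.923 in the third decimal place.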


We see that this estimate is between the two earlier estimates, 0.900 and 0.943.

In Section 8.3 we computed the variance of the number of trials before the first $A_1$ occurrence. Equation 8.32 gives for this variance when $q_0 = 1$,

(11.12)   $\sigma^2 = 2\Psi(\alpha_2, 1) + \Phi(\alpha_2, 1) - [\Phi(\alpha_2, 1)]^2$,

where $\Psi(\alpha, \beta)$ is the function given in Table A. We may use the estimate $\hat{\alpha}_2$ just obtained, compute $\sigma^2$ from the above formula, and compare the result with the variance of the trials to first avoidance obtained from Table 11.1. For $\alpha_2 = 0.92$ we get from Table A, $\Phi = 4.39$ and $\Psi = 9.97$. Equation 11.12 then gives $\sigma^2 = 5.06$ and $\sigma = 2.25$. The observed standard deviation is 2.25. Such close agreement is pleasant to find.

METHOD 3. We have already obtained three estimates of $\alpha_2$, but we now obtain another based upon the maximum likelihood procedure of Sections 9.9 and 9.10. We are concerned only with that portion of the data up through the first avoidance; hence $k = 0$ in equation 11.5, and we have

(11.13)   $q_{n,0} = \alpha_2^{n}$,   $k = 0$.

This equation is of the form of equation 9.40, and so the procedure developed in Section 9.9 is appropriate. The index $\nu$ of that section becomes trial number $n$ and so the quantities $N_\nu$ will become $N_n$, the number of dogs on trial $n$ with zero previous avoidances. Furthermore, $x_\nu = x_n$ is the number of the $N_n$ dogs that avoid on trial $n$. The numbers $N_n$ and $x_n$ are readily tabulated from the data given in Table 11.1. We give the results in Table 11.2.

In Section 9.10 we presented three methods of obtaining the maximum likelihood estimate. The third procedure we use first. From equation 9.74 we have

(11.14)   $\hat{\alpha}_2 \cong 1 - \dfrac{\sum_{n=0}^{\infty} x_n}{\sum_{n=0}^{\infty} n N_n}$.

In Table 11.2, we then see that

(11.15)   $\hat{\alpha}_2 \cong 1 - \dfrac{30}{447} = 0.933$.

TABLE 11.2
Tabulations obtained from the data in Table 11.1.

   n     N_n    x_n    n N_n    n(N_n - x_n)    n^2 N_n
   0     30      0       0           0             0
   1     30      3      30          27            30
   2     27      3      54          48           108
   3     24      4      72          60           216
   4     20      7      80          52           320
   5     13      4      65          45           325
   6      9      2      54          42           324
   7      7      4      49          21           343
   8      3      2      24           8           192
   9      1      0       9           9            81
  10      1      1      10           0           100
  11      0      0       0           0             0
  12      0      0       0           0             0
Totals          30     447         312          2039
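The column totals of Table 11.2 and the approximate estimate of equation 11.15 can be checked mechanically. In this sketch the lists `N` and `x` are copied from the table above (trials 0 through 10; the all-zero rows for trials 11 and 12 are omitted), and the variable names are ours.

```python
# N[n]: dogs with no previous avoidance at trial n; x[n]: how many of them avoid.
N = [30, 30, 27, 24, 20, 13, 9, 7, 3, 1, 1]
x = [0, 3, 3, 4, 7, 4, 2, 4, 2, 0, 1]

sum_x = sum(x)
sum_nN = sum(n * Nn for n, Nn in enumerate(N))
sum_n_shocked = sum(n * (Nn - xn) for n, (Nn, xn) in enumerate(zip(N, x)))
sum_n2N = sum(n * n * Nn for n, Nn in enumerate(N))

# Approximate maximum likelihood estimate, equation 11.14.
alpha2_approx = 1.0 - sum_x / sum_nN

if __name__ == "__main__":
    print(sum_x, sum_nN, sum_n_shocked, sum_n2N, round(alpha2_approx, 3))
```

A useful consistency check on the tabulation is that each $N_{n+1}$ equals $N_n - x_n$, since a dog leaves the "no previous avoidance" group exactly when it first avoids.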


METHOD 4. This approximate value of the maximum likelihood estimate can now be used to get a more exact value. We use the first procedure given in Section 9.10. The sum $D_1$, defined by

(11.16)   $D_1 = \sum_{n=0}^{\infty} n(N_n - x_n)$,

has the value 312 as can be seen from Table 11.2. We must solve numerically equation 9.66, which is in our present notation

(11.17)   $D_1 = \sum_{n=0}^{\infty} x_n g_n(\hat{\alpha}_2)$.

The function $g_n(\alpha)$ can be obtained from Table C for given values of $\alpha$. We proceed by trial-and-error as shown in Table 11.3. We first try $\hat{\alpha}_2 = 0.93$, since we just obtained $\hat{\alpha}_2 \cong 0.933$. Using the values of $x_n$ given in Table 11.2 and the values of $g_n(\hat{\alpha}_2)$ from Table C, we perform the multiplications and summation to get a sum of 350.5. This value is too large since $D_1 = 312$ as we have seen, and so we try $\hat{\alpha}_2 = 0.92$. This value of $\hat{\alpha}_2$ leads to a sum of 297.5, which is quite near $D_1 = 312$. By interpolation we get finally

(11.18)   $\hat{\alpha}_2 = 0.923$.

This happens to agree to three places with the estimate given by equation 11.11, which was obtained from the statistic $\bar{F}_1$.


TABLE 11.3
Computations made for numerical solution of equation 11.17.

   n    x_n    g_n(0.93)    x_n g_n(0.93)    g_n(0.92)    x_n g_n(0.92)
   1     3      13.29          39.87          11.50          34.50
   2     3      12.80          38.40          11.02          33.06
   3     4      12.33          49.32          10.56          42.24
   4     7      11.88          83.16          10.10          70.70
   5     4      11.43          45.72           9.67          38.68
   6     2      11.00          22.00           9.24          18.48
   7     4      10.57          42.28           8.83          35.32
   8     2      10.16          20.32           8.43          16.86
   9     0       9.71           0              8.05           0
  10     1       9.38           9.38           7.68           7.68
Totals                         350.55                        297.52
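The trial-and-error solution of equation 11.17 can also be carried out by bisection. The sketch below assumes $g_n(\alpha) = n\alpha^{n}/(1 - \alpha^{n})$; this form is our inference from the likelihood for $q_{n,0} = \alpha^{n}$, not a definition quoted from Table C, although it reproduces tabled entries such as 13.29 and 11.50. The names `g`, `likelihood_sum`, and `solve_mle` are hypothetical.

```python
# x_n from Table 11.2: number of dogs making their first avoidance on trial n.
X = {1: 3, 2: 3, 3: 4, 4: 7, 5: 4, 6: 2, 7: 4, 8: 2, 9: 0, 10: 1}
D1 = 312  # sum of n * (N_n - x_n), equation 11.16


def g(n, alpha):
    """Assumed form of the Table C function: n * alpha**n / (1 - alpha**n)."""
    return n * alpha ** n / (1.0 - alpha ** n)


def likelihood_sum(alpha):
    """Right side of equation 11.17 for a trial value of alpha."""
    return sum(count * g(n, alpha) for n, count in X.items())


def solve_mle(target=D1, lo=0.80, hi=0.99, iterations=60):
    """Bisection; likelihood_sum is increasing in alpha on (0, 1)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if likelihood_sum(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(round(likelihood_sum(0.92), 1), round(solve_mle(), 3))
```

Because the sum is a smooth increasing function of $\alpha$, two bracketing trial values and one interpolation, as in Table 11.3, already land very close to the bisection root.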


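Now that we have obtained the maximum likelihood estimate of $\alpha_2$ we estimate its variance by the method presented in Section 9.12. Since $q_0 = 1$, we can use the approximate formula 9.84, which becomes

(11.19)   $\sigma^2(\hat{\alpha}_2) \cong \dfrac{\hat{\alpha}_2^{2}(1 - \hat{\alpha}_2)}{\sum_{n=0}^{\infty} n N_n - (1 - \hat{\alpha}_2)\sum_{n=0}^{\infty} n^2 N_n}$.

Using the value $\hat{\alpha}_2 = 0.92$ and the value of the sums in Table 11.2, we then get

(11.20)   $\sigma^2(\hat{\alpha}_2) \cong \dfrac{(0.92)^2(0.08)}{447 - (0.08)(2039)} = 0.00024$,

and

(11.21)   $\sigma(\hat{\alpha}_2) \cong 0.015$.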
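The arithmetic behind the quoted variance can be sketched as follows. We assume here that the approximate formula takes the form $\hat{\alpha}_2^{2}(1 - \hat{\alpha}_2) / [\sum n N_n - (1 - \hat{\alpha}_2)\sum n^2 N_n]$; this is our reading of the formula, checked only by the fact that it reproduces the printed 0.00024 and 0.015 from the Table 11.2 totals. The function name is ours.

```python
import math


def approx_variance(alpha2_hat, sum_nN, sum_n2N):
    """Approximate variance of the maximum likelihood estimate of alpha_2.

    Assumed form: alpha^2 (1 - alpha) / (sum n N_n - (1 - alpha) sum n^2 N_n).
    """
    denominator = sum_nN - (1.0 - alpha2_hat) * sum_n2N
    if denominator <= 0:
        raise ValueError("approximation breaks down: non-positive denominator")
    return alpha2_hat ** 2 * (1.0 - alpha2_hat) / denominator


variance = approx_variance(0.92, 447, 2039)   # totals from Table 11.2
std_dev = math.sqrt(variance)

if __name__ == "__main__":
    print(round(variance, 5), round(std_dev, 3))
```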


Only the data up through the first avoidance of each dog have been used in obtaining estimates of $\alpha_2$. Data beyond the first avoidance depend on $\alpha_2$ but also depend on the avoidance parameter $\alpha_1$. Rather than try to use all the data to give simultaneous estimates of $\alpha_1$ and $\alpha_2$, we use the estimates of $\alpha_2$ already obtained and proceed to estimate $\alpha_1$ from the remainder of the data. In the next section we behave as if the true value of $\alpha_2$ were 0.92.


11.5 ESTIMATION OF THE AVOIDANCE PARAMETER, $\alpha_1$

Having estimated $\alpha_2$, we now consider some estimates of the avoidance parameter $\alpha_1$. First we consider the statistic $\bar{T}_2$, defined in Section 9.6 to be the mean total number of $A_2$ occurrences (shocks). Equation 9.20 becomes for $q_0 = 1$,

(11.22)   $\bar{T}_2 = T(\alpha_2, \alpha_1)$,

where $T(\alpha_2, \alpha_1)$ is the function given in Table B. From the data we get $\bar{T}_2 = 7.80$. Using the previously obtained value $\hat{\alpha}_2 = 0.92$, we obtain by interpolating in Table B,

(11.23)   $\hat{\alpha}_1 = 0.807$.


We now develop another scheme for estimating $\alpha_1$. First we consider only those data for which there is precisely one previous avoidance ($k = 1$). Equation 11.5 gives

(11.24)   $q_{n,1} = \alpha_1 \alpha_2^{n-1}$.

Since we have assumed that $\alpha_2 = 0.92$ we need only estimate $q_{n,1}$ for a given value of $n$ and then compute the estimate of $\alpha_1$ from the last equation. An unbiased estimate of $q_{n,1}$ is readily obtained from the data. We merely count up the number of dogs on trial $n$ that have just one previous avoidance; we call this number $N_{n,1}$. Then we note how many of those $N_{n,1}$ dogs avoided on trial $n$, and we call this number $x_{n,1}$. In Table 11.4 we present these tabulations. An unbiased estimate of $q_{n,1}$ is then $1 - (x_{n,1}/N_{n,1})$, and we have from equation 11.24,

(11.25)   $\hat{\alpha}_{1,n} = \dfrac{1 - (x_{n,1}/N_{n,1})}{\alpha_2^{n-1}}$.

For each trial $n$ on which $N_{n,1}$ is not zero we obtain an estimate of $\alpha_1$ from this equation, and so we denote the estimate for trial $n$ by $\hat{\alpha}_{1,n}$. Wishing to combine the several estimates $\hat{\alpha}_{1,n}$ in a sensible way, we take a weighted mean:

(11.26)   $\hat{\alpha}_1 = \dfrac{\sum_n W_n \hat{\alpha}_{1,n}}{\sum_n W_n}$.

The problem is to determine the weights $W_n$. It is well known that the variance of the grand estimate $\hat{\alpha}_1$ is minimized when the weights $W_n$ are inversely proportional to the variances of the $\hat{\alpha}_{1,n}$, provided that the $\hat{\alpha}_{1,n}$ are uncorrelated; and they are uncorrelated. The variance of the estimate, $1 - (x_{n,1}/N_{n,1})$, of $q_{n,1}$ is (from the binomial)

(11.27)   $\dfrac{q_{n,1}(1 - q_{n,1})}{N_{n,1}} = \dfrac{\alpha_1 \alpha_2^{n-1}(1 - \alpha_1 \alpha_2^{n-1})}{N_{n,1}}$.

TABLE 11.4
Tabulations obtained from Table 11.1 and used to obtain the estimates of $\alpha_1$ given in Table 11.5.

[Table body not recoverable from this copy. For each trial $n$ it lists $N_{n,k}$ and $x_{n,k}$ for $k = 1, 2, \ldots, 6$; the column sums for $k = 1$ are $\sum_n N_{n,1} = 59$ and $\sum_n x_{n,1} = 30$.]

In equation 11.25 we see that $\hat{\alpha}_{1,n}$ is obtained by dividing the estimate of $q_{n,1}$ by $\alpha_2^{n-1}$. Therefore, the variance of $\hat{\alpha}_{1,n}$ is obtained by dividing the variance of the estimate of $q_{n,1}$ by the square of $\alpha_2^{n-1}$, that is,

(11.28)   $\sigma^2(\hat{\alpha}_{1,n}) = \dfrac{\alpha_1 (1 - \alpha_1 \alpha_2^{n-1})}{\alpha_2^{n-1} N_{n,1}}$.

The common factor, $\alpha_1$, does not matter, and so we can take the weights to be

(11.29)   $W_n = \dfrac{\alpha_2^{n-1} N_{n,1}}{1 - \alpha_1 \alpha_2^{n-1}}$.

We then substitute these weights in equation 11.26. The numerator is

(11.30)   $\sum_n W_n \hat{\alpha}_{1,n} = \sum_n \dfrac{N_{n,1} - x_{n,1}}{1 - \alpha_1 \alpha_2^{n-1}}$,

and the denominator is

(11.31)   $\sum_n W_n = \sum_n \dfrac{\alpha_2^{n-1} N_{n,1}}{1 - \alpha_1 \alpha_2^{n-1}}$.

Both of the last two sums contain $\alpha_1$, the parameter being estimated, and so we replace $\alpha_1$ by $\hat{\alpha}_1$ in the last two equations. We then get from equation 11.26

(11.32)   $\hat{\alpha}_1 \sum_n \dfrac{\alpha_2^{n-1} N_{n,1}}{1 - \hat{\alpha}_1 \alpha_2^{n-1}} = \sum_n \dfrac{N_{n,1} - x_{n,1}}{1 - \hat{\alpha}_1 \alpha_2^{n-1}}$.

This simplifies to

(11.33)   $\sum_n \dfrac{x_{n,1}}{1 - \hat{\alpha}_1 \alpha_2^{n-1}} = \sum_n N_{n,1}$.

This equation must be solved numerically to obtain $\hat{\alpha}_1$. The process is somewhat laborious, but not excessively so, since we have already obtained some estimates of $\alpha_1$ and know the range it is in. The right side of equation 11.33 has the value 59 from the tabulations in Table 11.4. We have computed the left side for $\hat{\alpha}_1 = 0.73$ and $\hat{\alpha}_1 = 0.74$ and obtained 58.83 and 59.67, respectively. Interpolating, we then get

(11.34)   $\hat{\alpha}_1 = 0.732$,

as the solution of equation 11.33 for the data in Table 11.4.
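The numerical solution of an equation of the form $\sum_n x_{n,k}/(1 - a\,\alpha_2^{n-k}) = \sum_n N_{n,k}$, with $a$ standing for $\alpha_1^{k}$, can be sketched with a simple bisection. The function name and the counts in the usage example are hypothetical: the example does not use the tabulated data, but builds artificial counts that satisfy the model exactly for a chosen value, so that the solver can be seen to recover it.

```python
def solve_alpha1_power(N, x, alpha2, k=1, iterations=80):
    """Solve sum_n x[n] / (1 - a * alpha2**(n - k)) = sum_n N[n] for a.

    N, x : dicts keyed by trial number n (n > k), giving the number of dogs
           with exactly k previous avoidances and how many of them avoid.
    Returns a, an estimate of alpha1**k (for k = 1, of alpha1 itself).
    """
    target = sum(N.values())

    def left_side(a):
        return sum(x[n] / (1.0 - a * alpha2 ** (n - k)) for n in N)

    lo, hi = 0.0, 1.0 - 1e-12
    if left_side(lo) >= target:        # already too large at a = 0
        return 0.0
    for _ in range(iterations):        # left_side is increasing in a
        mid = 0.5 * (lo + hi)
        if left_side(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical counts, constructed so that the true value is 0.75.
ALPHA2 = 0.92
N_demo = {2: 4, 3: 8, 4: 10, 5: 9, 6: 7, 7: 5, 8: 3}
x_demo = {n: N_demo[n] * (1.0 - 0.75 * ALPHA2 ** (n - 1)) for n in N_demo}
recovered = solve_alpha1_power(N_demo, x_demo, ALPHA2, k=1)

if __name__ == "__main__":
    print(round(recovered, 4))
```

With real counts the root will of course not be exact; the point of the constructed example is only that the estimating equation has the true parameter as its solution when the observed proportions equal their expectations.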
The variance of the last estimate of $\alpha_1$ may be obtained from the relation

(11.35)   $\sigma^2(\hat{\alpha}_1) = \dfrac{\sum_n W_n^{2} \sigma^2(\hat{\alpha}_{1,n})}{[\sum_n W_n]^2}$,

though this is probably an underestimate because we have assumed $\alpha_2$ to be known. Furthermore, the estimates of $\alpha_1$ and $\alpha_2$ are surely correlated. The weights are given by equation 11.29 and the variances $\sigma^2(\hat{\alpha}_{1,n})$ by equation 11.28. Some algebra leads to the result

(11.36)   $\sigma^2(\hat{\alpha}_1) = \dfrac{\alpha_1}{\sum_n W_n}$.

When we replace $\alpha_1$ by its estimate 0.732, compute the weights from equation 11.29, and substitute these in the preceding equation, we have the estimate

(11.37)   $\sigma(\hat{\alpha}_1) \cong 0.095$.

We use this result shortly.


The procedure just described for obtaining an unbiased estimate of $\alpha_1$ may be extended to obtain unbiased estimates of $\alpha_1^{k}$ for $k = 2, 3, \ldots$. An unbiased estimate of $q_{n,k}$ of equation 11.5 is $1 - (x_{n,k}/N_{n,k})$, where $N_{n,k}$ is the number of dogs on trial $n$ with precisely $k$ previous avoidances, and $x_{n,k}$ is the number of those dogs that avoid on trial $n$. The development follows that given above for $k = 1$, and so we omit the details. Analogous to equation 11.33 we get

(11.38)   $\sum_n \dfrac{x_{n,k}}{1 - \hat{\alpha}_1^{k} \alpha_2^{n-k}} = \sum_n N_{n,k}$,

and these must be solved for the estimates $\hat{\alpha}_1^{k}$. Although the estimate of $\alpha_1^{k}$ is unbiased, its $k$th root, the corresponding estimate of $\alpha_1$, will be somewhat biased for $k > 1$. We have computed these estimates of $\alpha_1^{k}$ for $k = 2, 3, 4, 5$, and 6, and they are given in Table 11.5. When we take a simple average of the six estimates of $\alpha_1$ we get finally

(11.39)   $\hat{\alpha}_1 = 0.797$.

TABLE 11.5
Estimates of the avoidance parameter, $\alpha_1$, made from equation 11.38.

   k    estimate of alpha_1^k    estimate of alpha_1
   1          0.732                    0.732
   2          0.642                    0.801
   3          0.350                    0.705
   4          0.424                    0.807
   5          0.468                    0.859
   6          0.449                    0.875

The variance of this grand estimate of $\alpha_1$ may be computed, or at least estimated, in several ways. However, we use only a rather rough estimate. Equation 11.37 gives the standard deviation of the estimate of $\alpha_1$ obtained for $k = 1$. We assume that the corresponding standard deviation for $k = 2, 3, 4, 5, 6$ is about this same value, and so we merely divide 0.095 by $\sqrt{6}$ since we have averaged six estimates. We then get

(11.40)   $\sigma(\hat{\alpha}_1) \cong 0.04$,

as the standard deviation of the estimate given by equation 11.39. The estimate of 0.797 of equation 11.39 compares very well with our earlier estimate of 0.807 given by equation 11.23. For further discussion we use the estimate 0.80.
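The last column of Table 11.5, the grand average of equation 11.39, and the rough standard deviation of equation 11.40 follow from the first column by elementary arithmetic. The sketch below starts from the six printed estimates of $\alpha_1^{k}$; the variable names are ours.

```python
import math

# Estimates of alpha_1 ** k from Table 11.5, keyed by k.
power_estimates = {1: 0.732, 2: 0.642, 3: 0.350, 4: 0.424, 5: 0.468, 6: 0.449}

# k-th roots give the corresponding estimates of alpha_1.
root_estimates = {k: value ** (1.0 / k) for k, value in power_estimates.items()}

# Simple (unweighted) average of the six estimates, equation 11.39.
grand_estimate = sum(root_estimates.values()) / len(root_estimates)

# Rough standard deviation: treat the six estimates as equally precise
# and independent, each with standard deviation 0.095 (equation 11.37).
rough_sd = 0.095 / math.sqrt(len(root_estimates))

if __name__ == "__main__":
    for k in sorted(root_estimates):
        print(k, round(root_estimates[k], 3))
    print(round(grand_estimate, 3), round(rough_sd, 2))
```

The equal-precision assumption is the text's own simplification; since far fewer observations enter the estimates for large $k$, a weighted average would be a natural refinement.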


[Figure 11.1: plot against trials, $n$ (0 to 26), of the proportion of dogs avoiding.]

Fig. 11.1. Proportion of dogs that avoided on each trial in the Solomon-Wynne experiment (circles joined by solid lines), proportion of the stat-dogs that avoided on each trial (circles joined by dashed lines), and the theoretical means, $V_{1,n}$, computed from equation 11.41 (smooth curve).

[Figure 11.2: plot against trials, $n$ (0 to 26), of the mean cumulative number of shocks.]

Fig. 11.2. Mean cumulative number of shocks received by the dogs in the Solomon-Wynne experiment (circles) and the cumulative curve computed with the aid of equation 11.41.


11.6 GOODNESS-OF-FIT

In the preceding two sections we computed several estimates of the parameters $\alpha_2$ and $\alpha_1$. The values we now use are $\alpha_2 = 0.92$ and $\alpha_1 = 0.80$. We use these values to compute some theoretical properties of the data and compare them with the observed properties. We are interested in the proportion of dogs that avoid on each trial; this proportion is compared to the theoretical means $V_{1,n}$, discussed in Section 8.2 and elsewhere.

The approximate explicit formula for the means is given by equation 8.8. For $\lambda = 1$, $p_0 = 0$, and the above values of $\alpha_1$ and $\alpha_2$, this equation becomes
0.6793 — 1 


1 
(11.41) naas (033 + 1.67 c eal) 
For $n = 0, 1, \ldots, 25$ we have computed the right side of this equation and show the results in Fig. 11.1, along with the proportions from the data given in Table 11.1.

Another way to present the data is to plot the mean cumulative number of shocks versus trial number. In Fig. 11.2 we show such a plot for the data in Table 11.1. The theoretical curve was obtained by integrating $(1 - V_{1,n})$ over $n$, using equation 11.41 to give $V_{1,n}$.

For further comparisons we have made thirty Monte Carlo computations (called "stat-dogs"), using the operators

(11.42)   $Q_1 p = 0.80p + 0.20$,
          $Q_2 p = 0.92p + 0.08$,

and $p_0 = 0$. In Table 11.6 we show the resulting "data," and in Table 11.7 we compare several statistics of these data with the corresponding statistics of the Solomon-Wynne data. The close agreement between these two sets of statistics is evidence that the model gives a reasonably accurate description of the data. The reader may find it instructive to compare the dogs and stat-dogs not only by eye but also numerically on the basis of any statistic that occurs to him.

Now that we have obtained some numerical computations from the model we measure the goodness-of-fit in more formal ways. First, we make a run test, as described in Section 9.13. We note whether the computed $V_{1,n}$ of Fig. 11.1 is above (+) or below (-) the observed proportion of avoidances on each trial. We see in Fig. 11.1 that there are eight pluses and seventeen minuses; the number of runs is $d = 13$. The critical value for too few runs is 7, and the critical value for too many runs is 16, both at the 5 percent level. Hence, we conclude that the agreement between the model and the data is satisfactory as far as the run test is concerned.
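The run count used in the test above is easy to mechanize. The sketch below counts runs in a sequence of signs and, as a supplement to the tabled critical values, gives the mean and standard deviation of the number of runs under random ordering (the standard Wald-Wolfowitz results, which are not quoted in the text). The function names are ours, and the short sign string in the example is hypothetical, since the actual sequence of pluses and minuses is read from Fig. 11.1.

```python
import math


def count_runs(signs):
    """Number of maximal blocks of identical consecutive symbols."""
    runs = 0
    previous = None
    for symbol in signs:
        if symbol != previous:
            runs += 1
            previous = symbol
    return runs


def runs_mean_sd(n_plus, n_minus):
    """Mean and standard deviation of the run count under random ordering."""
    n = n_plus + n_minus
    product = 2.0 * n_plus * n_minus
    mean = 1.0 + product / n
    variance = product * (product - n) / (n * n * (n - 1))
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    print(count_runs("++--+-"))            # hypothetical sign sequence
    mean, sd = runs_mean_sd(8, 17)         # eight pluses, seventeen minuses
    print(round(mean, 2), round(sd, 2))
```

For eight pluses and seventeen minuses the expected number of runs is a little under 12, so the observed 13 runs lie well inside the acceptance region, in line with the conclusion drawn from the critical values 7 and 16.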


TABLE 11.6
Record of "shocks" (occurrences of $A_2$) of 30 stat-dogs for 25 trials each, computed with the operators of equations 11.42.

[Trial-by-trial record (stat-dogs 1 to 30 by trials 0 to 24, with S marking a shock) not legible in this copy.]


TABLE 11.7
Comparisons of the stat-dog "data" in Table 11.6 and the Solomon-Wynne data in Table 11.1.

                                                Stat-Dogs         Real Dogs
                                              Mean    S.D.      Mean    S.D.
Trials before first avoidance                 4.13    2.08      4.50    2.25
Trials before second avoidance                6.20    2.06      6.47    2.62
Total shocks                                  7.60    2.27      7.80    2.52
Trial of last shock                          12.53    4.78     11.33    4.36
Alternations                                  5.87    2.11      5.47    2.72
Longest run of shocks                         4.33    1.89      4.73    2.03
Trials before first run of four avoidances    9.47    3.48      9.70    4.14
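A stat-dog computation of the kind summarized in Table 11.7 can be sketched in a few lines. This is not the authors' procedure, only a minimal simulation under the stated operators ($\alpha_1 = 0.80$, $\alpha_2 = 0.92$, $p_0 = 0$, 25 trials); the function names, the seed, and the use of Python's `random` module are our choices, and a different random stream will give different stat-dogs.

```python
import random
import statistics

ALPHA1, ALPHA2 = 0.80, 0.92   # avoidance and shock parameters


def simulate_stat_dog(rng, trials=25, p0=0.0):
    """Return a list of booleans, True where the stat-dog avoided."""
    p = p0
    avoided = []
    for _ in range(trials):
        did_avoid = rng.random() < p
        avoided.append(did_avoid)
        if did_avoid:
            p = ALPHA1 * p + (1.0 - ALPHA1)   # operator Q_1
        else:
            p = ALPHA2 * p + (1.0 - ALPHA2)   # operator Q_2
    return avoided


def trials_before_first_avoidance(record):
    return record.index(True) if True in record else len(record)


def total_shocks(record):
    return record.count(False)


def summarize(n_dogs, seed=0):
    """Means and standard deviations of two of the Table 11.7 statistics."""
    rng = random.Random(seed)
    records = [simulate_stat_dog(rng) for _ in range(n_dogs)]
    firsts = [trials_before_first_avoidance(r) for r in records]
    shocks = [total_shocks(r) for r in records]
    return {
        "mean_first": statistics.mean(firsts),
        "sd_first": statistics.stdev(firsts),
        "mean_shocks": statistics.mean(shocks),
        "sd_shocks": statistics.stdev(shocks),
    }


if __name__ == "__main__":
    print(summarize(30, seed=1))      # one set of thirty stat-dogs
    print(summarize(5000, seed=1))    # a large run, to see sampling error shrink
```

Running the thirty-dog version with several seeds gives a feel for how much two sets of stat-dogs differ from each other, which is the yardstick against which the real-dog and stat-dog columns should be compared.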


Next we compute the statistic $U$ defined in Section 9.13. We need an expression for the second raw moment $V_{2,n}$. From equation 8.7 with $\lambda = 1$ we get

(11.43)   $V_{2,n} = \dfrac{(V_{1,n+1} - V_{1,n}) - (1 - \alpha_2)(1 - V_{1,n})}{\alpha_1 - \alpha_2} + V_{1,n}$.

Thus the statistic $U$ of equation 9.92 is

(11.44)   $U = \dfrac{\alpha_2 - \alpha_1}{N} \sum_n \dfrac{(x_n - N V_{1,n})^2}{(V_{1,n+1} - V_{1,n}) - (1 - \alpha_2)(1 - V_{1,n})}$.

Using the values of $V_{1,n}$ computed from equation 11.41 and the values $\alpha_2 = 0.92$, $\alpha_1 = 0.80$, and $N = 30$, we obtained $U = 18.6$. Assuming twenty-two degrees of freedom (twenty-five trials less three parameters estimated) we obtain $P = 0.67$ from a $\chi^2$ table. We again conclude that the fit is satisfactory.
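The tail probability read from the $\chi^2$ table can be checked without tables when the number of degrees of freedom is even, because the upper tail then reduces to a finite Poisson sum: $P(\chi^2_{2m} > u) = e^{-u/2} \sum_{j=0}^{m-1} (u/2)^j / j!$. The sketch below uses this standard identity; the function name is ours.

```python
import math


def chi2_upper_tail_even_df(u, df):
    """P(chi-square with df degrees of freedom exceeds u), for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    if u < 0:
        raise ValueError("u must be non-negative")
    half = u / 2.0
    term = 1.0          # (u/2)**0 / 0!
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    print(round(chi2_upper_tail_even_df(18.6, 22), 2))
```

For odd degrees of freedom a library routine for the regularized incomplete gamma function would be the natural substitute.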

The assumed commutativity of the operators implies that the avoidance probability after say two avoidances and three shocks is independent of the order in which those avoidances and shocks occurred. This in turn implies, among other things, that the number of additional trials required for say the fifth avoidance is independent of that order. It was suggested to us that this prediction can be tested, in principle, by examining the data, but it turns out that the number of observations available for such a test is small, even with data on thirty dogs.

That we succeed so well in fitting the model to the data is undoubtedly a result, in part, of the great care taken by the experimenters in controlling conditions. Considerable effort was made to control the strength of shock and the height of the barrier relative to each dog, to eliminate disturbing influences during the course of the experiment, etc. This analysis of the data shows that there is little evidence for dog-to-dog differences, and this is testimony to the carefully controlled conditions in the experiment.

11.7 A THEORETICAL INTERPRETATION 


In their theoretical paper, Solomon and Wynne discuss avoidance training in terms of a dual theory of learning. They conceive that two effects take place: (1) By classical conditioning, the conditioned stimulus (CS) acquires the essential property of the unconditioned stimulus (US), namely, ability to evoke the withdrawal response; and (2) by instrumental conditioning, the probability of the response with the CS becomes greater. Cessation of anxiety is considered to be the reinforcing agent, but anxiety is produced by the CS only after the classical conditioning has occurred.

In the remarks which follow we attempt to reconcile this theoretical position of Solomon and Wynne with the model discussed in the preceding sections.

In Chapter 2 we presented a set-theoretic description of instrumental conditioning and demonstrated that our linear operators could be deduced from some simple assumptions about stimulus sampling and conditioning (as proposed by W. K. Estes). Hence, we have already provided an interpretation of our operator $Q_1$ in terms of instrumental conditioning; in a given situation (experimental box with the CS), a response (jumping) has a probability $p$ of occurrence and, when it occurs, the probability changes to $Q_1 p$.

Fig. 11.3. Two sets of stimuli, $S$ and $S'$, which have an overlap or intersection $I$. The set $S$ denotes the experimental situation with the conditioned stimulus (CS, light out and gate up), and the set $S'$ represents the situation with the unconditioned stimulus (US, shock). The intersection $I$ represents those stimuli common to the two situations.

Next, we interpret our operator $Q_2$ in terms of the classical conditioning effect described by Solomon and Wynne. We denote the experimental situation with the CS by a set $S$ and the situation with the US by a set $S'$. These sets have an intersection $I$ as shown in Fig. 11.3. We define an index $\eta$ by

(11.45)   $\eta = \dfrac{\mathcal{M}(I)}{\mathcal{M}(S)}$,

where $\mathcal{M}(\ )$ denotes the measure of the set or subset named in the parentheses. The elements contained in $S'$ are assumed to be completely conditioned to the jumping response, that is, the response has probability one of occurring on a trial on which the US appears. We assume that the similarity of $S$ to $S'$ increases each time the CS is followed by the US. In particular, we assume that the new index is

(11.46)   $Q_2 \eta = \alpha_2 \eta + (1 - \alpha_2)$.

Now if none of the elements contained in the complement of $I$ in $S$ are conditioned to the jumping response, the probability $p$ of jumping in $S$ (avoidance) equals the index $\eta$. Therefore, under these conditions, equation 11.46 leads at once to our basic equation for $Q_2 p$ (equation 11.2).

There remains to be discussed how the two effects can be combined to lead to the same two operators $Q_1$ and $Q_2$. As a result of assumptions already made, $Q_1$ will have the form given in equations 11.2 even though subset $I$ of $S$ is completely conditioned to jumping. Thus, for any similarity $\eta$, the first of equations 11.2 describes the effect of an avoidance on the probability of avoidance. However, after one or more avoidances have occurred we have elements conditioned to jumping in the manner shown in Fig. 11.4: a part of the complement of $I$ in $S$ is conditioned. Now when the US follows the CS, the new $S'$ is as shown by the dotted line in Fig. 11.4. The index increases by an amount

(11.47)   $\Delta\eta = (1 - \alpha_2)(1 - \eta)$.

Fig. 11.4. The situations $S$ and $S'$ before and after a trial on which shock occurs. The shaded area indicates stimuli conditioned to jumping: it is assumed that all elements in $S'$ (situation with the US) are conditioned to jumping and that a shock trial increases the measure of the intersection of $S$ and $S'$.

The probability $q$ of not avoiding is

(11.48)   $q = \dfrac{\mathcal{M}(C)}{\mathcal{M}(S)}$,

where $C$ is the non-conditioned part of $S$. This probability decreases by an amount

(11.49)   $\Delta q = \dfrac{\mathcal{M}(C)}{\mathcal{M}(S) - \mathcal{M}(I)} \, \Delta\eta$.

Using equations 11.45, 11.47, and 11.48 in equation 11.49 gives

(11.50)   $\Delta q = (1 - \alpha_2) q$.

Hence the new probability of not avoiding is

(11.51)   $\tilde{Q}_2 q = q - \Delta q = \alpha_2 q$.

This is the assumed operator shown in equations 11.3.

We have just shown how our two basic operators of equations 11.2 and 11.3 can be deduced from a set-theoretic description of the two effects described by Solomon and Wynne. The instrumental part follows from the simple conditioning mechanism treated earlier, and the classical part involves an increasing index of similarity of the situation with the CS to the situation with the US. We do not consider this interpretation of the model to be essential or unique. Rather, we have indicated once again that the model can be interpreted in terms of current learning theories or principles.
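The algebra of the set-theoretic argument can be checked numerically. The sketch below assumes that the enlarged intersection absorbs conditioned and non-conditioned elements of the complement of $I$ in proportion to their measures, which is our reading of the step from equation 11.47 to equation 11.50; the measures in the example are hypothetical, and the function name is ours.

```python
def shock_trial_update(m_S, m_I, m_C, alpha2):
    """Return (q_before, q_after) for one shock trial.

    m_S : measure of S (situation with the CS)
    m_I : measure of the intersection I of S and S'
    m_C : measure of the non-conditioned part C of S (must fit outside I)
    """
    if not 0 <= m_I < m_S or not 0 <= m_C <= m_S - m_I:
        raise ValueError("need 0 <= M(I) < M(S) and 0 <= M(C) <= M(S) - M(I)")
    eta = m_I / m_S                                   # similarity index
    q = m_C / m_S                                     # probability of not avoiding
    delta_eta = (1.0 - alpha2) * (1.0 - eta)          # growth of the index
    delta_q = m_C / (m_S - m_I) * delta_eta           # share of C absorbed
    return q, q - delta_q


if __name__ == "__main__":
    # Hypothetical measures: M(S) = 1.0, M(I) = 0.3, M(C) = 0.5.
    q_before, q_after = shock_trial_update(1.0, 0.3, 0.5, 0.92)
    print(q_before, round(q_after, 4), round(0.92 * q_before, 4))
```

Whatever admissible measures are chosen, the updated probability equals $\alpha_2 q$, because the factor $(1 - \eta)$ in the growth of the index cancels against $\mathcal{M}(S) - \mathcal{M}(I)$.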


11.8 EXPERIMENTS ON THE CS-US INTERVAL 

Although we have estimated parameters from the data and succeeded in reproducing those data with the model, we might ask how the model helps in interpreting the experimental results. The model does provide a measure of the relative effects of shock and avoidance trials. We found that $\hat{\alpha}_2 = 0.92$ and $\hat{\alpha}_1 = 0.80$; since $(0.92)^{2.7} = 0.80$, we can say that an avoidance trial is worth about three shock trials, that is, three successive shocks have the same effect on the probability of avoidance as a single avoidance trial. Furthermore, our analysis of the data suggests but little evidence for dog-to-dog differences. These conclusions seem to us to be useful ones and conclusions that would not have been reached without a model. However, a major value of the model is that it provides two summary statistics of the data which may be used in comparing dogs run under different experimental conditions. We now describe a series of experiments conducted by Solomon and his colleagues and present the results of our analysis of the data.

In the experiment described in Section 11.2, the time interval between the conditioned stimulus (CS, light out and gate raised) and the unconditioned stimulus (US, shock turned on) was 10 seconds. In more recent studies, Brush, Brush, and Solomon [5] have run five additional groups of dogs with CS-US intervals of 2.5, 5, 20, 40, and 80 seconds. There were eleven dogs in each group, and so our estimates of the shock and avoidance parameters $\alpha_2$ and $\alpha_1$ for those groups are not as reliable as the estimates for the 10-second dogs. In Table 11.8 we show the mean number of trials, $\bar{F}_1$, before the first avoidance and the mean number, $\bar{T}_2$, of total shocks received per dog for each group. From these two statistics, we estimated $\alpha_1$ and $\alpha_2$ by using equations 11.10 and 11.22. The results are also given in Table 11.8.

TABLE 11.8
Data obtained from the experiment by Brush, Brush, and Solomon on the effect of the CS-US interval in avoidance training of dogs. The mean number of trials before the first avoidance is denoted by $\bar{F}_1$ and the mean number of total shocks per dog is denoted by $\bar{T}_2$. The estimates of the avoidance parameter $\alpha_1$ and the shock parameter $\alpha_2$ are given in the last two columns.

CS-US Interval    Number                          Estimate      Estimate
(seconds)         of Dogs    F_1 mean   T_2 mean  of alpha_1    of alpha_2
    2.5             11         4.45       7.64      0.80          0.92
    5               11         4.64       7.18      0.76          0.93
   10               30         4.50       7.80      0.81          0.92
   20               11         5.00       9.09      0.82          0.94
   40               11         7.82      11.45      0.79          0.97
   80               11         3.27      10.18      0.94          0.86

The estimates of the shock parameter $\alpha_2$ indicate (1) that a shock has nearly the same effect with CS-US intervals of 2.5, 5, 10, and 20 seconds; (2) that it has a smaller effect with an interval of 40 seconds; and (3) that it has an appreciably larger effect when the interval is 80 seconds. The estimates of the avoidance parameter $\alpha_1$ show that an avoidance trial has about the same effect when the CS-US interval is increased from 2.5 to 40 seconds, but that the effect decreases at 80 seconds. Neither estimated parameter, $\hat{\alpha}_1$ nor $\hat{\alpha}_2$, is a monotonic function of the CS-US interval, and this leads us to suspect that more data are needed to give a complete picture of the effect of changing the interval. The variances of the estimates $\hat{\alpha}_1$ and $\hat{\alpha}_2$ are appreciable, even for the thirty dogs in the 10-second group as we saw in Sections 11.4 and 11.5. This may be seen most easily from the variances of the estimates $\bar{F}_1$ and $\bar{T}_2$ from the data. We saw in Section 11.4 that the standard deviation of the observed number of trials before the first avoidance was 2.25. For thirty dogs, then, $\bar{F}_1$ has a standard deviation of about 0.41. The standard deviation of the estimated mean number of total shocks, $\bar{T}_2$, is about 0.46. These estimated standard deviations are for the thirty dogs in the 10-second group, and the corresponding estimates for the other groups would be appreciably larger because the groups are smaller.
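Two small calculations in this section can be made explicit: the number of shock trials equivalent to one avoidance trial, which is $\log \alpha_1 / \log \alpha_2$, and the standard error of a group mean, which is the standard deviation divided by the square root of the group size. The sketch below uses the figures quoted in the chapter (standard deviations 2.25 and 2.52 for thirty dogs); the function names are ours.

```python
import math


def shock_equivalents(alpha1, alpha2):
    """Number m of shock trials with alpha2**m == alpha1."""
    return math.log(alpha1) / math.log(alpha2)


def standard_error(sd, group_size):
    """Standard deviation of a mean of group_size independent observations."""
    return sd / math.sqrt(group_size)


if __name__ == "__main__":
    print(round(shock_equivalents(0.80, 0.92), 2))
    print(round(standard_error(2.25, 30), 2), round(standard_error(2.52, 30), 2))
    # With eleven dogs and the same spread, the standard error is larger.
    print(round(standard_error(2.25, 11), 2))
```

The last line assumes, purely for illustration, that an eleven-dog group has the same dog-to-dog spread as the thirty-dog group; the text gives no standard deviations for the smaller groups.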


11.9 SUMMARY 

A recent experiment of Solomon and Wynne [3] on the avoidance training of dogs is analyzed in terms of a model with $Q_1 p = \alpha_1 p + (1 - \alpha_1)$ and $Q_2 p = \alpha_2 p + (1 - \alpha_2)$. Alternative $A_1$ is identified with avoidance and alternative $A_2$ is identified with non-avoidance (shock). The initial avoidance probability is taken to be zero. We estimate the avoidance parameter $\alpha_1$ and the shock parameter $\alpha_2$ from the data and obtain $\hat{\alpha}_1 = 0.80$ and $\hat{\alpha}_2 = 0.92$. The computations made from the model in terms of those two parameters are in close agreement with the data. A theoretical interpretation of the experiment is described in Section 11.7, and data on the effects of the CS-US interval are discussed in Section 11.8.


REFERENCES

1. Hilgard, E. R., and Marquis, D. G. Conditioning and learning. New York: D. Appleton-Century, 1940, pp. 58-62.

2. Keller, F. S., and Schoenfeld, W. N. Principles of psychology. New York: Appleton-Century-Crofts, 1950, pp. 311-315.

3. Solomon, R. L., and Wynne, L. C. Traumatic avoidance learning: acquisition in normal dogs. Psychol. Monogr., 1953, 67, No. 4 (Whole No. 354).

4. Miller, N. E. Learnable drives and rewards. Handbook of experimental psychology, S. S. Stevens, ed., New York: Wiley, 1951, pp. 435-472.

5. Brush, F. R., Brush, E. S., and Solomon, R. L. Traumatic avoidance learning: the effects of CS-US interval with a delayed-conditioning procedure. J. comp. physiol. Psychol., 1955, 48, in press.


CHAPTER 12 


An Experiment on Imitation 


12.1 THE EXPERIMENT 


Imitation has been considered a basic process in social behavior, and it 
has been shown experimentally that imitation can be learned, that is, that 
it can be controlled by rewards and punishments. Miller and Dollard 
have given a detailed analysis of imitative behavior in terms of reinforce- 
ment theory, and they describe a number of experiments on rats and on 
children [1]. These experiments demonstrate that one subject will learn
to imitate another subject when it is rewarded for doing so, and that this
imitative behavior generalizes to other stimulus situations.

Schein investigated imitation in small groups of army inductees in
problem-solving situations [2], but his data show little evidence that
imitation can be controlled by rewards and punishments. Later Shwartz
conducted a simple imitation experiment (suggested by Schein's work) on 
grade school children as well as high school students [3]. Some of the 
data from her experiment are analyzed in this chapter. 

In Shwartz's experiment, two children were brought into a room and
told that they were to participate in a guessing game. Each child was
to guess whether the experimenter was going to say “a” or “b” on each of
fifty trials; first, child 1 said “a” or “b,” then child 2 said “a” or “b,”
and then the experimenter said “a” or “b.” During each block of 10
trials, the experimenter said what child 1 said 8 times. A schedule was
drawn up in advance by randomizing within blocks of 10 trials. The same
schedule was used on all pairs of subjects in the experimental groups.
According to the design, child 1 was correct 8 out of 10 times, but child 2
had the option of “imitating” child 1 or not; if he did imitate he would be
right 8 out of 10 times. Of course the subjects were not told that child 1
was right on 80 percent of the trials; they were told that they were used
together to save time and “somebody had to be first.”
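The schedule construction described above (a fixed number of confirmations placed at random within each block of 10 trials) can be sketched as follows. This is only an illustration: the function name, the seed, and the use of Python's `random` module are assumptions, and the output is not Shwartz's actual schedule.

```python
import random


def block_randomized_schedule(n_blocks=5, block_size=10,
                              confirms_per_block=8, seed=0):
    """Return a list of flags, 1 = experimenter confirms child 1, 0 = denies.

    Hypothetical helper: the confirmations are shuffled within each block,
    so every block contains exactly `confirms_per_block` confirmations.
    """
    rng = random.Random(seed)  # fixed seed: one schedule shared by all pairs
    schedule = []
    for _ in range(n_blocks):
        block = ([1] * confirms_per_block
                 + [0] * (block_size - confirms_per_block))
        rng.shuffle(block)  # randomize the order inside this block only
        schedule.extend(block)
    return schedule


schedule = block_randomized_schedule()
```

Because the shuffle is confined to each block, child 1 is confirmed on exactly 8 of every 10 trials, while the position of the two denials varies from block to block.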

Four main groups of subjects were used in Shwartz's experiment, with
twenty pairs of subjects in each group. There were two experimental
groups and two control groups; two age levels were used: fourth grade
students and high school students. The control groups received the same
treatment as the experimental groups except that child 1 was “rewarded”
on only 5 out of 10 trials. The main results found were that the experimental
grade school children imitated significantly more than their
controls whereas there was no significant difference between the experimental
and control groups at the high school level. Furthermore, both
control groups imitated less than half the time.

In the next section we describe a model for analyzing Shwartz's data. 
In later sections we estimate the model parameters from the data, and 
finally we discuss the goodness-of-fit of the model. 


12.2 THE MODEL

The behavior of child 2 is of major interest; he says “a” or “b” and is
right or wrong, depending on whether or not he imitated child 1. In
constructing the model for this experiment we could identify the responses
“a” and “b” of child 2 with alternatives $A_1$ and $A_2$ in the mathematical
system, but, since the purpose is to study the imitation behavior of child 2,
we say that alternative $A_1$ occurred if child 2 made the same response as
child 1 and that alternative $A_2$ occurred if child 2 made the opposite
response. In other words, the response patterns aa and bb are identified
with $A_1$ and the patterns ab and ba with $A_2$. As always, these identifications
are not unique.

If it should happen that the first child always said “a” after a few trials,
it would be difficult to argue that the second child was learning to imitate;
we could just as well argue that the second child was learning to say “a”
independent of the first child. However, in Shwartz's experimental
grade school group, the number of “a” responses made by the first child
averaged 26.25 for the 50 trials, and ranged from 23 to 33. Thus we infer
that child 2 is not just learning to say “a” or “b.”

The events $E_1$ and $E_2$ in the mathematical system are identified with the
announcements of the experimenter following the guesses of the subjects.
When the experimenter confirms the guess of child 1, event $E_1$ occurs.
When the experimenter denies the guess of child 1, event $E_2$ occurs. If
we choose, we may consider that $E_1$ rewards child 2 for imitating or
punishes him for not imitating, depending on what child 2 actually did.
We consider this a single event, independent of child 2's behavior. Similarly
we may regard $E_2$ as reward of non-imitation or punishment of
imitation. With these identifications, the events are experimenter
controlled. Event $E_1$ occurs on a specified 8 trials in each block of 10.
Hence, if $p_n$ is the probability that child 2 imitates on trial $n$, the order of
application of operators $Q_1$ and $Q_2$ to $p_n$ is completely specified by the
experimenter's schedule and is independent of the behavior of the subjects.
Furthermore, if we assume that the subjects are “identical,” that is, have
the same initial probability of imitation and the same parameter values,
then all subjects have the same probability $p_n$ on trial $n$. We do not
obtain a distribution of probabilities as we did in the applications presented
in earlier chapters. This fact makes the analysis especially simple.

The operator $Q_1$, applied to $p_n$ when event $E_1$ occurs, is assumed to be
given by

(12.1)    $Q_1 p_n = \alpha_1 p_n + (1 - \alpha_1),$

that is, $\lambda_1 = 1$. When the experimenter confirms what child 1 said on
trial $n$ we apply $Q_1$ to $p_n$, the probability that child 2 made the same
response as child 1 on trial $n$. We take the operator $Q_2$ to be

(12.2)    $Q_2 p_n = \alpha_2 p_n,$

that is, $\lambda_2 = 0$. This operator is applied to $p_n$ when the experimenter
denies child 1's response. Since we know the precise sequence of events
used in the experiment we can compute the probabilities $p_n$ on every trial
in terms of the initial probability $p_0$ and the parameters $\alpha_1$ and $\alpha_2$. In
later sections we estimate $p_0$, $\alpha_1$, and $\alpha_2$ from Shwartz's data.
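As a rough illustration of how equations 12.1 and 12.2 generate the whole sequence of probabilities from $p_0$ and the schedule, the following sketch applies the two operators trial by trial. The function names and the numerical values in the example are illustrative assumptions, not estimates from the data.

```python
def q1(p, alpha1):
    """Operator Q1 (equation 12.1): experimenter confirms child 1."""
    return alpha1 * p + (1.0 - alpha1)


def q2(p, alpha2):
    """Operator Q2 (equation 12.2): experimenter denies child 1."""
    return alpha2 * p


def imitation_probabilities(p0, alpha1, alpha2, confirmations):
    """Return [p_0, p_1, ..., p_n] for a given sequence of events.

    `confirmations[n]` is True when event E1 occurs on trial n and
    False when event E2 occurs.
    """
    probs = [p0]
    for confirmed in confirmations:
        p = probs[-1]
        probs.append(q1(p, alpha1) if confirmed else q2(p, alpha2))
    return probs


# Illustrative values only: p_0 = 0.5, alpha_1 = 0.8, alpha_2 = 0.5,
# with events E1, E1, E2 on trials 0, 1, 2.
example = imitation_probabilities(0.5, 0.8, 0.5, [True, True, False])
```

In this example the probability rises toward unity under the two confirmations (0.5, 0.6, 0.68) and is then halved by the denial (0.34), which is the behavior implied by $\lambda_1 = 1$ and $\lambda_2 = 0$.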

An alternative model, but a more complex one, can be obtained as
follows. We could still take the position that child 2 imitates child 1
(response $A_1$) or does not imitate child 1 (response $A_2$) on every trial.
We could then say that there are two possible outcomes (not events):
Outcome $O_1$ would be confirmation, by the experimenter, of child 1's
response, and outcome $O_2$ would be denial of that response. As a result,
we would have four events listed below.

Child 2's Response                 Outcome                                        Event
$A_1$ (imitation of child 1)       $O_1$ (confirmation of child 1's response)     $E_{11}$
$A_1$ (imitation of child 1)       $O_2$ (denial of child 1's response)           $E_{12}$
$A_2$ (non-imitation of child 1)   $O_1$ (confirmation of child 1's response)     $E_{21}$
$A_2$ (non-imitation of child 1)   $O_2$ (denial of child 1's response)           $E_{22}$

For the four events $E_{jk}$ we would then have operators $Q_{jk}$ defined by

$Q_{jk} p = \alpha_{jk} p + (1 - \alpha_{jk}) \lambda_{jk}.$


Reasonable restrictions on the parameters can be made: (1) that $\lambda_{11} = 1$
and $\lambda_{12} = 0$, that is, that the probability $p$ of imitation tends to unity or
zero, depending on whether or not it is rewarded when it occurs, (2) that
$\lambda_{21} = 1$ and $\lambda_{22} = 0$ by making the analogous assumption about non-imitation.
These restrictions give the operators,

$Q_{11} p = \alpha_{11} p + (1 - \alpha_{11}),$
$Q_{12} p = \alpha_{12} p,$
$Q_{21} p = \alpha_{21} p + (1 - \alpha_{21}),$
$Q_{22} p = \alpha_{22} p.$

The experimental design calls for confirmation of child 1's response a
fixed proportion $\pi$ of the trials, independent of the responses made by
the two children. Thus we get the following event probabilities.

Event        Probability of Occurrence
$E_{11}$     $p \pi$
$E_{12}$     $p (1 - \pi)$
$E_{21}$     $(1 - p) \pi$
$E_{22}$     $(1 - p)(1 - \pi)$


We have set up a model which is an application of the case of experimenter-subject-controlled
events described in Sections 3.13 and 4.6.
Further assumptions about the four parameters $\alpha_{jk}$ can be made to particularize
the model. One such set of assumptions leads to the model
previously described, namely, we let $\alpha_{11} = \alpha_{21} = \alpha_1$ and $\alpha_{12} = \alpha_{22} = \alpha_2$.
In effect these assumptions mean that confirmation after imitation or
non-imitation has the same effect on child 2, and a similar statement
holds for denial. In other words, outcome $O_1$ has the same effect on
$p$ whether it follows $A_1$ or $A_2$, and, similarly, outcome $O_2$ has the same
effect after either response.

We wish to emphasize that other restrictions on the $\alpha_{jk}$ are possible;
we might prefer to make no further restrictions and estimate all four
parameters from the data. Moreover, still other identifications between
the model alternatives and events could be made, leading to still other
models.

12.3 THE DATA 
We analyze in detail only the data obtained by Shwartz from the
experimental grade school group. We represent the data by random
variables $x_{i,n}$ defined by

(12.3)    $x_{i,n} = \begin{cases} 1 & \text{if child 2 of the } i\text{th pair } (i = 1, 2, \ldots, N) \text{ imitates child 1 of that pair on trial } n, \\ 0 & \text{otherwise.} \end{cases}$

The number of imitations occurring on trial $n$ is

(12.4)    $x_n = \sum_{i=1}^{N} x_{i,n}.$

Table 12.1 gives the values of $x_n$ obtained by Shwartz for the experimental
grade school group on each of fifty trials. For that group, $N = 20$.


TABLE 12.1

The number of imitations $x_n$ made on each of 50 trials for the grade school group
studied by Shwartz. There were 20 pairs of subjects. The asterisk (*) beside
the trial number indicates that event $E_2$ occurred on that trial.

  n    $x_n$      n    $x_n$      n    $x_n$
  0     7        17     6        34     9
 *1    10       *18    15        35    10
  2     7        19     9        36    13
  3     8        20    12        37    14
  4     5        21    11        38    14
  5     9        22    15       *39    15
 *6    10       *23    13       *40     9
  7     8        24     9        41     8
  8     7       *25    14        42    11
  9     9        26     8        43    12
 10    13        27    10        44    12
 11    12        28    12        45    15
 12    15        29    13        46    11
 13    11        30    13       *47    17
 14    17       *31    15        48     8
 15    13        32     9        49    13
*16    15        33    12
It is convenient to introduce another set of random variables $y_n$ defined
by

(12.5)    $y_n = \begin{cases} 1 & \text{if experimenter said what child 1 said on trial } n, \\ 0 & \text{otherwise.} \end{cases}$

In other words, $y_n = 1$ if event $E_1$ occurred, and $y_n = 0$ if event $E_2$
occurred. In Table 12.1 asterisks (*) indicate trials on which event $E_2$
occurred.

In a later section we inquire about the homogeneity of the subjects,
and so in Table 12.2 we give the number of imitations made by each of
the 20 subjects for the first 25 trials and the last 25 trials.




TABLE 12.2 


The number of imitations made by each subject during the first and last 25 trials 
of Shwartz's imitation experiment. 


Subject Trials 0-24 Trials 25-49 Total 


1 14 21 35 
2 21 24 45 
3 11 15 26 
4 17 16 33 
5 19 13 32 
6 15 6 21 
7 13 15 28 
8 12 20 32 
9 12 20 32 
10 10 17 27 
11 15 18 33 
12 9 12 21 
13 9 11 20 
14 8 8 16 
15 19 23 42 
16 8 9 17 
17 11 8 19 
18 10 8 18 
19 16 11 27 
20 17 22 39 

Total 266 297 563 

Mean 13.30 14.85 28.15 


12.4 ESTIMATION OF $\alpha_1$ AND $\alpha_2$

The simplicity of the experimental design and the model chosen leads
to a direct procedure for estimating the parameters $\alpha_1$ and $\alpha_2$. First
consider $\alpha_2$. When event $E_2$ occurs on trial $n$,

(12.6)    $p_{n+1} = \alpha_2 p_n.$

For any trial $n$ we can estimate $p_n$ by the proportion of subjects that
imitated on trial $n$, and we can estimate $p_{n+1}$ by the corresponding proportion
on trial $n + 1$. Hence we obtain estimates of $\alpha_2$ without further
ado. The estimate of $p_n$ is $x_n/N$, and the estimate of $p_{n+1}$ is $x_{n+1}/N$.
Thus we can write

(12.7)    $x_{n+1} \cong \alpha_2 x_n.$

We want to combine these estimates for all trials for which event $E_2$
occurs, that is, for which $y_n$ of equation 12.5 is zero. Therefore we have

(12.8)    $\sum_n (1 - y_n) x_{n+1} \cong \alpha_2 \sum_n (1 - y_n) x_n.$

This leads to the estimate

(12.9)    $\hat{\alpha}_2 = \dfrac{\sum_n (1 - y_n) x_{n+1}}{\sum_n (1 - y_n) x_n}.$

In words, the estimate is obtained by counting the total number of
imitations on trials immediately after $E_2$ occurs and dividing that number
by the total number of imitations on trials on which $E_2$ occurs. From the
data given in Table 12.1 we get

(12.10)    $\hat{\alpha}_2 = 81/133 = 0.609.$

The procedure for estimating $\alpha_1$ is much the same as that for $\alpha_2$. We
obtain from equation 12.1 and the relation $q_n = 1 - p_n$,

(12.11)    $q_{n+1} = \alpha_1 q_n.$

This equation is appropriate when event $E_1$ occurs on trial $n$, that is, when
$y_n = 1$. For such trials, $q_n$ is estimated by $1 - (x_n/N)$ and $q_{n+1}$ is estimated
by $1 - (x_{n+1}/N)$. Therefore, analogous to equation 12.9 is the equation

(12.12)    $\hat{\alpha}_1 = \dfrac{\sum_n y_n (N - x_{n+1})}{\sum_n y_n (N - x_n)}.$

The data from Table 12.1 give

(12.13)    $\hat{\alpha}_1 = 305/363 = 0.840.$
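The two ratio estimates of equations 12.9 and 12.12 can be checked with a short computation. In the sketch below the counts are transcribed from Table 12.1 (illegible entries of the scanned table were resolved so that the sums agree with the totals in Table 12.2 and the proportions in Table 12.4), and the variable names are illustrative assumptions.

```python
N = 20  # pairs of subjects in the experimental grade school group

# Number of imitations x_n on trials 0..49, transcribed from Table 12.1.
x = [7, 10, 7, 8, 5, 9, 10, 8, 7, 9,
     13, 12, 15, 11, 17, 13, 15, 6, 15, 9,
     12, 11, 15, 13, 9, 14, 8, 10, 12, 13,
     13, 15, 9, 12, 9, 10, 13, 14, 14, 15,
     9, 8, 11, 12, 12, 15, 11, 17, 8, 13]

# Trials on which event E2 (denial of child 1) occurred, i.e. y_n = 0.
E2_TRIALS = {1, 6, 16, 18, 23, 25, 31, 39, 40, 47}

last = len(x) - 1  # trial 49 has no successor, so sums run over n = 0..48

# Equation 12.9: imitations after E2 trials over imitations on E2 trials.
num2 = sum(x[n + 1] for n in range(last) if n in E2_TRIALS)
den2 = sum(x[n] for n in range(last) if n in E2_TRIALS)
alpha2_hat = num2 / den2

# Equation 12.12: non-imitations after E1 trials over non-imitations on them.
num1 = sum(N - x[n + 1] for n in range(last) if n not in E2_TRIALS)
den1 = sum(N - x[n] for n in range(last) if n not in E2_TRIALS)
alpha1_hat = num1 / den1
```

With these counts the sums are 81 and 133 for $\hat{\alpha}_2$ and 305 and 363 for $\hat{\alpha}_1$, matching equations 12.10 and 12.13.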


Note that $\hat{\alpha}_1$ is considerably larger than $\hat{\alpha}_2$. This implies that events
which increase imitation have a smaller effect than events which inhibit
imitation. We might argue that this kind of copying is not approved in
our society and hence the observed effect.

The same estimation procedures were also used on Shwartz's experimental
high school group, and the values $\hat{\alpha}_1 = 0.925$ and $\hat{\alpha}_2 = 0.982$
were obtained. These parameter estimates are much nearer unity than
the corresponding ones for the grade school group. This result agrees
with Shwartz's finding that the grade school students showed much more
reaction to the rewards and punishments for imitation than did the high
school students.




12.5 ESTIMATION OF $p_0$

In the last section we estimated the two parameters $\alpha_1$ and $\alpha_2$, but we
made no reference to the initial probability of imitation, $p_0$. We now
consider the problem of estimating $p_0$ from the data.

The obvious estimate of $p_0$ is the proportion, $x_0/N$, of imitations observed
on trial zero. Table 12.1 gives

(12.14)    $\hat{p}_0 = (x_0/N) = (7/20) = 0.350.$

The variance is

(12.15)    $\sigma^2(\hat{p}_0) = \dfrac{p_0 (1 - p_0)}{N}$

and may be estimated by replacing $p_0$ with $\hat{p}_0$; the standard deviation is
then

(12.16)    $\sigma(\hat{p}_0) = 0.107.$

The variation in $\hat{p}_0$ is large, and so we want to obtain a better estimate of $p_0$.
Table 12.1 shows that event $E_1$ occurred on trial 0; equation 12.1 gives

(12.17)    $p_1 = \alpha_1 p_0 + (1 - \alpha_1).$

We have already estimated $\alpha_1$, and we can estimate $p_1$ by the proportion
$x_1/N$, and from this we obtain another estimate of $p_0$. The data give

(12.18)    $0.500 \cong 0.840 p_0 + 0.160.$

Solving for $p_0$, we get the estimate

(12.19)    $\hat{p}_0 = 0.405.$

We could continue in this way and get an estimate of $p_0$ for every trial,
but the problem remaining is how to combine the estimates. The obvious
procedure is to take a weighted mean of the individual trial estimates.
We denote the weights by $W_n$ and the individual estimates by $\hat{p}_{0,n}$. The
grand estimate is then

(12.20)    $\hat{p}_0 = \dfrac{\sum_n W_n \hat{p}_{0,n}}{\sum_n W_n}.$

The only question is how to determine the weights $W_n$. To do this we
will appeal to a $\chi^2$ criterion.


We would like to choose $p_0$ so that we get the best possible fit to the
data. The goodness-of-fit may be measured by the Pearson $\chi^2$. The
expected number of imitations on trial $n$ is $N p_n$ and the expected number
of non-imitations is $N q_n$; the observed numbers are $x_n$ and $N - x_n$,
respectively. Therefore, the Pearson $\chi^2$ is

(12.21)    $\chi^2 = \sum_n \left[ \dfrac{(x_n - N p_n)^2}{N p_n} + \dfrac{(N - x_n - N q_n)^2}{N q_n} \right].$

Using the relation $q_n = 1 - p_n$, we have after simplifications

(12.22)    $\chi^2 = \sum_n \dfrac{(x_n - N p_n)^2}{N p_n q_n}.$
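The simplification from equation 12.21 to equation 12.22 rests on the identity $N - x_n - N q_n = -(x_n - N p_n)$, which follows from $q_n = 1 - p_n$. Spelled out for a single term of the sum, the step reads as follows.

```latex
\frac{(x_n - N p_n)^2}{N p_n} + \frac{(N - x_n - N q_n)^2}{N q_n}
  = (x_n - N p_n)^2 \left( \frac{1}{N p_n} + \frac{1}{N q_n} \right)
  = \frac{(x_n - N p_n)^2 \, (p_n + q_n)}{N p_n q_n}
  = \frac{(x_n - N p_n)^2}{N p_n q_n}.
```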


The smaller the value of this $\chi^2$ the better the fit to the data. The problem
is to choose $p_0$ so that $\chi^2$ is minimized. Since $p_n$ is a function of $p_0$ we
need to differentiate $\chi^2$ with respect to $p_0$ and set the derivative equal to
zero. But unfortunately $p_n$ appears in both the denominator and the
numerator of each term in the sum; this complicates greatly the resulting
equations.

To avoid the complications just mentioned we shall use Neyman's
modification of the Pearson $\chi^2$ criterion. This modification, which might
be called the “observed $\chi^2$,” is an application of the “best asymptotically
normal” estimate developed by Neyman [4]. It is obtained simply by
replacing the expected quantity in the denominator of the Pearson $\chi^2$ by
the observed quantity:

(12.23)    $\chi_o^2 = \sum_n \dfrac{N (x_n - N p_n)^2}{x_n (N - x_n)}.$

Neyman has shown that minimizing such a statistic leads to estimates that
have essentially all the nice properties of maximum likelihood estimates.
An important pleasant feature is that $p_n$ appears only in the numerator,
whereas in an ordinary $\chi^2$ both numerator and denominator contain $p_n$.
Differentiating the observed $\chi^2$ with respect to $p_0$, we get

(12.24)    $\dfrac{\partial \chi_o^2}{\partial p_0} = \sum_n \dfrac{-2 N^2 (x_n - N p_n)}{x_n (N - x_n)} \dfrac{\partial p_n}{\partial p_0}.$


We need $p_n$ as a function of $p_0$. For every trial $n$ we can write

(12.25)    $p_n = A_n + B_n p_0$

where $A_n$ and $B_n$ are functions of $\alpha_1$ and $\alpha_2$ alone and depend upon the
precise sequence of events. For example,

(12.26)    $p_1 = Q_1 p_0 = (1 - \alpha_1) + \alpha_1 p_0,$

and so $A_1 = 1 - \alpha_1$ and $B_1 = \alpha_1$. Table 12.1 shows that $E_2$ occurs on
trial 1, giving

(12.27)    $p_2 = Q_2 Q_1 p_0 = \alpha_2 (1 - \alpha_1) + \alpha_1 \alpha_2 p_0.$

Therefore, $A_2 = \alpha_2 (1 - \alpha_1)$ and $B_2 = \alpha_1 \alpha_2$. Event $E_1$ occurs on trial 2,
giving

(12.28)    $p_3 = Q_1 Q_2 Q_1 p_0 = \alpha_1 \alpha_2 (1 - \alpha_1) + (1 - \alpha_1) + \alpha_1^2 \alpha_2 p_0.$

In this way we can obtain $A_n$ and $B_n$ in terms of $\alpha_1$ and $\alpha_2$ for all values of
$n$. Using expression 12.25 in equation 12.24 we get

(12.29)    $\dfrac{\partial \chi_o^2}{\partial p_0} = \sum_n \dfrac{-2 N^2 (x_n - N A_n - N B_n p_0) B_n}{x_n (N - x_n)}.$

Setting this derivative equal to zero gives the equation

(12.30)    $\sum_n \dfrac{(x_n - N A_n) B_n}{x_n (N - x_n)} = \hat{p}_0 \sum_n \dfrac{N B_n^2}{x_n (N - x_n)}.$

This equation can be used to solve for the minimum observed $\chi^2$ estimate
$\hat{p}_0$. However, we note that the individual trial estimates $\hat{p}_{0,n}$ which
entered equation 12.20 may be written as

(12.31)    $\hat{p}_{0,n} = \dfrac{(x_n/N) - A_n}{B_n},$

and so equation 12.30 may be written in the form

(12.32)    $\sum_n \dfrac{N B_n^2}{x_n (N - x_n)} \hat{p}_{0,n} = \hat{p}_0 \sum_n \dfrac{N B_n^2}{x_n (N - x_n)}.$

Hence the weights $W_n$ in equation 12.20 are

(12.33)    $W_n = \dfrac{N B_n^2}{x_n (N - x_n)}.$

With these weights and the individual trial estimates of equation 12.31
we can compute the grand estimate $\hat{p}_0$ from equation 12.20. Table 12.3
shows the computations for the estimation of $p_0$. The coefficients $A_n$
and $B_n$ of equation 12.25 were computed for trials 0 to 16, using the
estimates $\hat{\alpha}_1 = 0.840$ and $\hat{\alpha}_2 = 0.609$ obtained in the last section. From
Table 12.3 we finally get

(12.34)    $\hat{p}_0 = 0.288.$

TABLE 12.3

Computations used in obtaining the estimate $\hat{p}_0$ from
equations 12.20, 12.31, and 12.33.

  n    $A_n$     $B_n$    $x_n$   $\hat{p}_{0,n}$    $W_n$    $W_n \hat{p}_{0,n}$
  0    0         1.0000     7      0.350     0.2198     0.0769
  1    0.1600    0.8400    10      0.405     0.1411     0.0571
  2    0.0976    0.5124     7      0.493     0.0577     0.0284
  3    0.2420    0.4304     8      0.367     0.0386     0.0142
  4    0.3633    0.3615     5     -0.313     0.0349    -0.0109
  5    0.4651    0.3037     9     -0.050     0.0186    -0.0009
  6    0.5507    0.2551    10     -0.199     0.0130    -0.0026
  7    0.3359    0.1556     8      0.412     0.0050     0.0021
  8    0.4422    0.1307     7     -0.705     0.0038    -0.0027
  9    0.5314    0.1098     9     -0.741     0.0024    -0.0018
 10    0.6064    0.0922    13      0.474     0.0019     0.0009
 11    0.6692    0.0774    12     -0.894     0.0013    -0.0012
 12    0.7221    0.0650    15      0.429     0.0011     0.0005
 13    0.7665    0.0546    11     -3.965     0.0006    -0.0024
 14    0.8040    0.0459    17      1.002     0.0008     0.0008
 15    0.8353    0.0386    13     -4.801     0.0003    -0.0014
 16    0.8617    0.0324    15     -3.448     0.0003    -0.0010

Total                                        0.5412     0.1560

In the next section these estimates are used in measuring goodness-of-fit. 
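The weighted-mean estimate of equation 12.20, with the coefficients of equation 12.25 and the weights of equation 12.33, can be reproduced approximately with the sketch below. It is a minimal illustration under stated assumptions: the counts for trials 0 to 16 are transcribed from Table 12.1, the names are hypothetical, and small differences from the printed table are to be expected because the table was computed by hand with rounded intermediate values.

```python
N = 20
alpha1, alpha2 = 0.840, 0.609  # estimates from Section 12.4

# Imitation counts x_n for trials 0..16 (Table 12.1).
x = [7, 10, 7, 8, 5, 9, 10, 8, 7, 9, 13, 12, 15, 11, 17, 13, 15]
E2_TRIALS = {1, 6, 16}  # trials in this range on which event E2 occurred


def coefficients(n_trials):
    """Return lists A, B with p_n = A_n + B_n * p_0 (equation 12.25)."""
    A, B = [0.0], [1.0]
    for n in range(n_trials - 1):
        if n in E2_TRIALS:  # Q2: p -> alpha2 * p
            A.append(alpha2 * A[-1])
            B.append(alpha2 * B[-1])
        else:               # Q1: p -> alpha1 * p + (1 - alpha1)
            A.append(alpha1 * A[-1] + (1.0 - alpha1))
            B.append(alpha1 * B[-1])
    return A, B


A, B = coefficients(len(x))

# Equation 12.33: W_n = N * B_n^2 / (x_n * (N - x_n)).
weights = [N * b * b / (xn * (N - xn)) for b, xn in zip(B, x)]

# Equation 12.31: single-trial estimates of p_0.
trial_estimates = [((xn / N) - a) / b for a, b, xn in zip(A, B, x)]

# Equation 12.20: weighted mean of the single-trial estimates.
p0_hat = (sum(w * e for w, e in zip(weights, trial_estimates))
          / sum(weights))
```

The weights fall off rapidly with $n$ because $B_n$ shrinks at every trial, so the early trials dominate the estimate; the result is close to the value 0.288 of equation 12.34.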


[Figure: horizontal axis "Trials, n," from 0 to 50.]

Fig. 12.1. Proportions, $\hat{p}_n$, of imitations on each trial observed by Shwartz (broken
lines) and computed probability, $p_n$, of imitation (solid lines).




12.6 GOODNESS-OF-FIT 


Having estimated the three parameters, we can compute the probabilities
$p_n$ of imitation on each trial. We use

(12.35)    $Q_1 p_n = 0.840 p_n + 0.160,$
           $Q_2 p_n = 0.609 p_n,$
           $p_0 = 0.288.$


The results are shown in Table 12.4 along with the observed proportions 
of imitations on each trial. These are also shown in Fig. 12.1. 


TABLE 12.4

Computed probability, $p_n$, of imitation and observed
proportion, $\hat{p}_n$, of imitations on each trial for Shwartz's data.
The column headed Diff. gives the sign of $p_n - \hat{p}_n$.

  n    $p_n$    $\hat{p}_n$   Diff.       n    $p_n$    $\hat{p}_n$   Diff.
  0    0.288    0.350     -         25    0.511    0.700     -
  1    0.402    0.500     -         26    0.311    0.400     -
  2    0.246    0.350     -         27    0.421    0.500     -
  3    0.366    0.400     -         28    0.514    0.600     -
  4    0.467    0.250     +         29    0.592    0.650     -
  5    0.553    0.450     +         30    0.657    0.650     +
  6    0.624    0.500     +         31    0.712    0.750     -
  7    0.380    0.400     -         32    0.434    0.450     -
  8    0.480    0.350     +         33    0.524    0.600     -
  9    0.563    0.450     +         34    0.600    0.450     +
 10    0.633    0.650     -         35    0.664    0.500     +
 11    0.692    0.600     +         36    0.718    0.650     +
 12    0.741    0.750     -         37    0.763    0.700     +
 13    0.782    0.550     +         38    0.801    0.700     +
 14    0.817    0.850     -         39    0.833    0.750     +
 15    0.846    0.650     +         40    0.507    0.450     +
 16    0.871    0.750     +         41    0.309    0.400     -
 17    0.530    0.300     +         42    0.419    0.550     -
 18    0.605    0.750     -         43    0.512    0.600     -
 19    0.369    0.450     -         44    0.590    0.600     -
 20    0.470    0.600     -         45    0.656    0.750     -
 21    0.555    0.550     +         46    0.711    0.550     +
 22    0.626    0.750     -         47    0.757    0.850     -
 23    0.686    0.650     +         48    0.461    0.400     +
 24    0.418    0.450     -         49    0.547    0.650     -


The goodness-of-fit between the computed probabilities $p_n$ and the
observed proportions $\hat{p}_n$ may be determined by computing the Pearson $\chi^2$
given by equation 12.22. This was done for the first 25 trials and the last
25 trials separately. The values obtained are given in Table 12.5.

TABLE 12.5

Results of chi-square computations for testing goodness-of-fit
of model to Shwartz's data.

            Trials 0-24    Trials 25-49    Trials 0-49
$\chi^2$       36.00          25.34           61.34
d.f.           22*            22*             47
P              0.03           0.28            0.08

* The number of degrees of freedom was taken to be 22 in order to be
conservative (see text).

The number of degrees of freedom was taken to be 47 for all 50 trials since
three parameters were estimated from the data. To be conservative,
we assumed that there were 22 degrees of freedom for the two halves of
the data. The probabilities P obtained from $\chi^2$ tables show that there is
a significant difference between the computed and observed figures during
the first 25 trials but not for the last 25 trials. One might conjecture that
child 2 is not alerted to imitation as much in the first 25 trials as in the
last 25, and hence that the model is more appropriate for the second half
of the data. When all 50 trials are considered, the disagreement is not
great: there is an 8 percent chance of a fit at least as bad as the obtained
one, assuming the model is correct.

The goodness-of-fit between the computed and observed proportions
of imitations can be tested in another way. We can apply the “run test,”
described in Section 9.13, to see if the differences follow a systematic
pattern. In Table 12.4 we have noted whether the difference $p_n - \hat{p}_n$ was
positive or negative; 22 are positive and 28 are negative. We then counted
up the number of runs of pluses and minuses. (A run is a consecutive
series of pluses or a consecutive series of minuses.) A total of 23 runs is
observed in Table 12.4. Referring to Section 9.13, we find the expected
number of runs is 25.64 and that the variance of the number of runs is
11.89. This leads to a standard deviate of 0.77 in absolute value. Therefore
we conclude that there is no significant systematic difference between the
computed and the observed proportions.
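The run count and the figures quoted above can be reproduced with a short computation. The sketch below regenerates $p_n$ from equation 12.35 and the schedule of Table 12.1, then applies the usual large-sample formulas for the mean and variance of the number of runs. Those formulas are assumed here to be the ones given in Section 9.13, which is not reproduced in this chapter; they do return the values 25.64 and 11.89 stated in the text. All names are illustrative.

```python
import math

N = 20
alpha1, alpha2, p0 = 0.840, 0.609, 0.288  # equation 12.35

# Imitation counts x_n for trials 0..49 (Table 12.1).
x = [7, 10, 7, 8, 5, 9, 10, 8, 7, 9,
     13, 12, 15, 11, 17, 13, 15, 6, 15, 9,
     12, 11, 15, 13, 9, 14, 8, 10, 12, 13,
     13, 15, 9, 12, 9, 10, 13, 14, 14, 15,
     9, 8, 11, 12, 12, 15, 11, 17, 8, 13]
E2_TRIALS = {1, 6, 16, 18, 23, 25, 31, 39, 40, 47}

# Computed probabilities p_n, applying Q1 or Q2 according to the schedule.
p = [p0]
for n in range(len(x) - 1):
    prev = p[-1]
    p.append(alpha2 * prev if n in E2_TRIALS
             else alpha1 * prev + (1.0 - alpha1))

# Sign of p_n minus the observed proportion x_n / N on each trial.
signs = [1 if pn - xn / N > 0 else -1 for pn, xn in zip(p, x)]
n_plus = signs.count(1)
n_minus = signs.count(-1)

# A new run starts wherever the sign changes.
runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Standard mean and variance of the number of runs (assumed formulas).
total = n_plus + n_minus
cross = 2 * n_plus * n_minus
expected_runs = cross / total + 1
variance_runs = cross * (cross - total) / (total ** 2 * (total - 1))
deviate = (runs - expected_runs) / math.sqrt(variance_runs)
```

The deviate comes out negative (fewer runs than expected), with magnitude about 0.77, well within the range expected by chance.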

The expected total number of imitations during the 50 trials can be
compared to the observed mean, though it is not much of a check on the
model. In terms of the random variables $x_{i,n}$ defined by equation 12.3,
the total number of imitations for the $i$th subject is

(12.36)    $T_i = \sum_n x_{i,n}.$

These variables $x_{i,n}$ are independent binomial observations, and so the
expected value of $T_i$ is

(12.37)    $E(T_i) = \sum_n E(x_{i,n}) = \sum_n p_n.$

In other words, to obtain the expected number of imitations made by a
single child, sum the probabilities $p_n$ over all trials. Table 12.4 gives

(12.38)    $E(T_i) \cong 28.535.$

In Table 12.2 we have tabulated the number of imitations made by each
child. The mean of these is

(12.39)    $\bar{T} = 28.15,$

agreeing closely with the expected value just estimated.

We may further compute the expected variance in the numbers $T_i$
from equation 12.36. The $x_{i,n}$ are each independently drawn from a
different binomial distribution. Hence,

(12.40)    $\sigma^2(T_i) = \sum_n \sigma^2(x_{i,n}),$

but the variance of $x_{i,n}$ is simply the binomial variance $p_n (1 - p_n)$ and so

(12.41)    $\sigma^2(T_i) = \sum_n p_n (1 - p_n).$

From the values of $p_n$ given in Table 12.4 we get

(12.42)    $\sigma^2(T_i) = 11.02.$

From the values of $T_i$ given in Table 12.2 for the 20 subjects we computed
the variance and obtained

(12.43)    $\hat{\sigma}^2(T_i) = 68.33.$

This observed variance is appreciably larger than the expected variance.
Our interpretation is that the subjects are not “identical”; that is, they did
not have the same values of $p_0$, $\alpha_1$, and $\alpha_2$. It is quite possible, however,
that a more complicated model would account for the data better than
the model used here.
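The comparison of expected and observed moments of $T_i$ can be checked numerically. The following sketch recomputes $p_n$ from equation 12.35 and the schedule of Table 12.1, then evaluates equations 12.37 and 12.41, and takes the observed totals from Table 12.2. The variable names are illustrative, and the observed variance is computed with divisor $N$ (the population form), which is the choice that reproduces the printed 68.33; the results agree with the printed figures only to the rounding used in the hand computations.

```python
import statistics

alpha1, alpha2, p0 = 0.840, 0.609, 0.288  # equation 12.35
E2_TRIALS = {1, 6, 16, 18, 23, 25, 31, 39, 40, 47}
N_TRIALS = 50

# Computed probabilities p_0, ..., p_49.
p = [p0]
for n in range(N_TRIALS - 1):
    prev = p[-1]
    p.append(alpha2 * prev if n in E2_TRIALS
             else alpha1 * prev + (1.0 - alpha1))

expected_total = sum(p)                              # equation 12.37
expected_variance = sum(pn * (1.0 - pn) for pn in p)  # equation 12.41

# Total imitations T_i for each of the 20 subjects (Table 12.2).
totals = [35, 45, 26, 33, 32, 21, 28, 32, 32, 27,
          33, 21, 20, 16, 42, 17, 19, 18, 27, 39]

observed_mean = statistics.mean(totals)
observed_variance = statistics.pvariance(totals)  # divisor N, not N - 1
```

The observed variance is roughly six times the expected binomial variance, which is the discrepancy the text attributes to differences among subjects.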

12.7 SUMMARY

A simple experiment on imitation is described and the data analyzed
in terms of experimenter-controlled events. The schedule of events is
the same for all subjects, giving a single probability (rather than a distribution)
for each trial. Three parameters are estimated from the data
and the goodness-of-fit of the model considered. It is found that events
which inhibit imitation have a greater effect than events which increase it.


REFERENCES 

1. Miller, N. E., and Dollard, J. Social learning and imitation. New Haven: Yale
University Press, 1941.

2. Schein, E. The effect of reward on imitation. Ph.D. Thesis, Harvard University,
1952.

3. Shwartz, N. An experimental study of imitation: the effects of reward and age.
Senior honors thesis, Radcliffe College, 1953.

4. Neyman, J. Contribution to the theory of the $\chi^2$ test. Proceedings of the Berkeley
symposium on mathematical statistics. Berkeley: University of California Press, 1949,
pp. 239-273.


CHAPTER 13 


Symmetric Choice Problems 


13.1 INTRODUCTION 


This chapter deals with a class of experimental problems involving a
choice between two or more alternatives on each trial, alternatives with a
certain type of symmetry. Examples are the T-maze in which a rat
chooses to turn right or left, and a “two-armed bandit” in which a person
pushes one of two buttons. The symmetry can be recognized by the fact
that interchanging or renaming the alternatives seems to change nothing
of psychological importance. For example, whether “left” or “right” is
the favorable side is of little interest. Indeed, many experimenters will
randomize the favorable side from rat to rat to protect against initial
position preferences. The outcomes of the possible choices, however,
ordinarily will not be the same. One may result in reward and the other
not, or one may pay off a certain proportion of the time and the other a
different proportion of the time.


13.2 T-MAZE EXPERIMENTS


An experimental arrangement that has received considerable attention 
It is one of the si 


à cus attention on the simple type of open 
maze used with rats, 


A diagram of the apparatus is shown in Fig. 13.1. A rat is placed at
the starting position, s, and it runs to the choice point, c. The rat then
goes to one of two goal boxes, $A_1$ and $A_2$,


food. This sequence of behavior constitu 
procedure is ordinarily repeated man 
behavior of a rat on an experiment 
action. Moreover, from the point of view 
discussed in Chapter 2, the problem is not simple. 


stimulus situation as it traverses the maze, and after passing the choice
point is in one of two possible stimulus situations. This total behavior
on a trial can be broken up, of course, and appropriate measures or
indices used for each part. For example, we might ask about the latency
of the starting position, s, or the running speed between s and the choice
point c. It seems to us, however, that the portion of the rat's behavior
peculiar to this experiment is the behavior at the choice point, c. Which
of the two alternatives does the rat choose?

In the analysis which follows we ignore all aspects of a rat's behavior
on a trial except the way a rat turns at the choice point. On each complete
experimental trial, the rat arrives at the choice point where the population
of stimulus elements is held constant from trial to trial. Two classes of
responses are defined corresponding to the goal box reached, $A_1$ or $A_2$.
On each experimental trial one and only one of these response classes
occurs. An experimental trial, then, is an opportunity for
choosing among mutually exclusive and exhaustive alternatives. According
to the model, the state of the organism on a particular trial is completely
specified by a probability $p$ that the rat will go to goal box $A_1$ and a
probability $q = 1 - p$ that it will go to goal box $A_2$. When these
probabilities are known for every trial we have complete information about
the model of the learning process. These probabilities can be estimated
from the proportion of turns to goal box $A_1$ made by a single rat on a
number of trials or by the proportion of a population of rats which go to
goal box $A_1$ on a particular trial.

Fig. 13.1. Schematic diagram of the maze used by Brunswik [1] and by
Stanley [2]. The starting box is denoted by s, the choice point by c, and the goal
boxes by $A_1$ and $A_2$. (Diagram labels: probability $p$ at $A_1$ and $1 - p$ at $A_2$.)

In the experiments of Brunswik and Stanley various schedules of
rewards and non-rewards were used on the two sides of the maze. For
example, food was placed on one side of the maze on 75 percent of the
trials and on the other side on 25 percent. In this example the reward
probabilities, 0.75 and 0.25, add to unity, but they need not and in some
experiments did not. Stanley ran one group, for example, with reward on
one side for 50 percent of the trials and reward on the other side for 0
percent of the trials. (In the following sections we denote such experimental
conditions by 75 : 25, 50 : 0, etc., where the first number gives the
reward probability on the more favorable side.) Brunswik and Stanley
did not actually use complete randomness in setting up the reward
schedules, but did use restricted randomness within blocks of trials.


13.3 EXPERIMENTS WITH HUMAN SUBJECTS


The experimental literature contains reports on numerous studies of 
human behavior in simple two-choice situations. Several of these were 
modifications and extensions of an experiment described in 1939 by 
Humphreys [3]. In those experiments, a subject was seated before a 
panel containing two light bulbs. On each trial, the left light was turned 
on; a few seconds later, on some trials the right bulb would come on, 
whereas on other trials the right bulb would not come on. The subject
was to guess whether or not the right light would come on during each
trial. In Humphreys' original experiment, three schedules were used:
continuous reinforcement (right light turned on during each trial), partial
reinforcement (right light on during half of trials), and extinction (right
light on during none of the trials). If we identify event $E_1$ with turning
on of the right light and event $E_2$ with not turning it on, we may describe
the schedule by the two probabilities $\pi_1$ and $\pi_2$ of events $E_1$ and
$E_2$, respectively. Humphreys' schedules were then 100 : 0, 50 : 50, and
0 : 100. (The first number gives the percent of $E_1$ occurrences.)

Experiments similar to Humphreys' were reported by Grant, Hake,
and Hornseth, and, in one of these, schedules of 0 : 100, 25 : 75, 50 : 50,
75 : 25, and 100 : 0 were used [4]. (In other studies, the intertrial
interval was varied, but no significant differences from this source were
found [5].) Note that in all groups the two event probabilities necessarily
summed to unity; the light either came on or it did not.

Jarvik reported an experiment in which subjects predicted on each trial
whether the experimenter was going to say the word "check" or the word
"plus" [6]. Schedules of 60 : 40, 67 : 33, and 75 : 25 were used in that
study. More recently, Hake and Hyman described an experiment in
which subjects predicted whether a horizontal or a vertical set of lights
would come on [7]. Their schedules were 50 : 50 and 75 : 25, but, in
addition, a Markov dependence was introduced into the sequences used
with two groups of subjects.

In all the human experiments described so far, one of two events
occurred on each trial. If we care to think of confirmation of a guess or
prediction as reinforcement of that guess, then one and only one of the two
possible guesses was reinforced on each trial. This restriction was not
present in the Brunswik and Stanley rat experiments described in the
preceding section, for in those experiments the reward probabilities for
the two responses could be varied independently. During the spring of




1951 we designed and Henry Gerbrands constructed a "two-armed 
bandit" to replicate the Brunswik-Stanley experiments, and Jacqueline 
Jarrett Goodnow carried out the replications. The apparatus consisted 
of a panel of lights and two push-button electrical switches as shown in 
Fig. 13.2. A large center light indicated the start of a trial. The subject 
inserted a poker chip to activate the machine and then pushed one of the 
two buttons; a light came on above that button. By means of an 
electromagnet one or more poker chips could be delivered to the subject 


in a box at the center of the apparatus. If a chip was delivered he "won,"
if not he "lost." The pay-offs were not automatic but were controlled by
the experimenter who sat in an adjacent room; the apparatus appeared to
be automatic to the subjects. By appropriate pay-off schedules, various
partial reinforcement conditions were introduced. Schedules used by
Goodnow and Laval Robillard include 100 : 0, 50 : 0, 75 : 25, 100 : 50,
30 : 0, 80 : 0, 80 : 40, 60 : 30.* (This apparatus was also used by David
Carver to collect data for a senior honors thesis at Harvard. He studied
the effects of differential amounts of reward on the two sides under partial
reinforcement.)

Fig. 13.2. Sketch of the two-armed bandit used by Goodnow, Robillard, and the
authors. When the machine is in operation, the three upper lights are on and a
soft buzz may be heard. The subject pushes button $A_1$ or button $A_2$, and this
turns on the light directly above the button pushed. Poker chips are dropped
into the region behind the glass.

* The optimum decision rule (the "best strategy") for playing the two-armed bandit
for n trials seems not to be known.

Detambel [8] and Neimark [9] had subjects push one of two telegraph 
keys to turn on a light. In Detambel's experiment, one light bulb was used.
On each trial, each key was either correct or incorrect; when a correct key 
was pressed the light came on. In one of Neimark’s experiments, the 
same procedure was used except that there was a light bulb above each 
key. In both these studies, as well as in the two-armed bandit experiments, 
the outcome probabilities for the two responses could be manipulated 
independently. 

More recently, M. M. Flood has developed a simple punch-board 
technique of collecting data. Several columns of holes appeared on a 
punch-board, and the subject was told to punch one hole in each row, 
working from the top down. Behind the hole punched out was a special 
mark to indicate a win, or a different mark to indicate a loss. In one of 
Flood's original punch-board experiments, nine columns were used 
(nine-choice situation), and later Flood and Mosteller used these punch- 
boards for two-choice experiments. (With these boards a subject can 
punch at his own rate, and so trials may be strongly massed.) 

Another variation of the experimental design was developed by Bush, 
R. L. Davis, and G. L. Thompson. A subject was presented with two 
ordinary playing cards, face down, and was asked to turn over one of them. 
If the card turned over was red, the subject was given a nickel; if the card
was black, the subject received nothing. The procedure was repeated
for many trials. By "stacking" the cards in advance, various partial
reinforcement schedules were obtained. This procedure was used with
six Santa Monica (California) high-school students during the summer of
1952, and with ten Harvard undergraduates and ten Cambridge (Massachusetts)
high-school students during the fall of 1952 as part of a course
in experimental social psychology at Harvard.


The procedural differences among the many experiments mentioned
above are rather important, we believe, in deciding how to analyze the data.
In the studies by Humphreys, by Grant, Hake, and Hornseth, by Jarvik,
by Hake and Hyman, and others, one of two environmental changes
occurred on every trial independent of the subject's response. As a result,
this procedure has been called non-contingent [9]: a light came on or it
did not, a horizontal or vertical set of lights came on, or the experimenter
said "check" or "plus." These procedures suggest that we identify
these two possible environmental changes with two events $E_1$ and $E_2$ in
the model and use the case of experimenter-controlled events first introduced
in Section 3.9. A different procedure was used in the studies by
Brunswik and by Stanley, in the two-armed bandit experiments, in the
study by Detambel, and in the punch-board and playing-card experiments.
In those experiments, the environmental change depended upon the
subject's response. The probability of presentation of food or of a poker
chip depends upon which choice is made by the subject. Therefore,
this procedure has been called contingent. It suggests at once the case of
experimenter-subject-controlled events first discussed in Section 3.13.
The two possible outcomes would then be reward and non-reward, or
"light on" and "no light on." As a result, there would be four events
corresponding to the possible response-outcome pairs.

In the experiments performed by Neimark [9] both procedures were 
used. In the non-contingent groups, the schedules of appearances and 
non-appearances of the lights were independent of the subject's responses, 
whereas in the contingent groups, a light came on only when a subject 
pressed the appropriate key. This study permits a direct comparison of 
the two procedures. In addition, Neimark ran several groups of subjects 
with three keys and three lights. We discuss this three-choice problem in 
a later section.

We have pointed out that the non-contingent procedure suggests the
case of experimenter-controlled events, whereas the contingent procedure
seems to require the case of experimenter-subject-controlled events.
However, both procedures could be analyzed using the case of
experimenter-subject-controlled events. We might argue that pressing key 1
and having light 2 come on was not the same event as pressing key 2 and
having light 2 come on. Therefore, we would prefer to have an operator
for each possibility and to determine from the data whether the two
occurrences had the same effect on behavior. Experimenter-subject-controlled
events are always more general than experimenter-controlled
events. Nevertheless, for practical analysis of data, we find it convenient
to reduce the number of parameters in the model, and so we assume that
experimenter-controlled events are appropriate for experiments that use
the non-contingent procedure.


13.4 A MODEL WITH EXPERIMENTER-CONTROLLED 
EVENTS 
We employ experimenter-controlled events and equal alphas for describing
data from the non-contingent experiments. For most of these
experiments, one of two environmental changes occurs on each trial; we
identify these changes with events $E_1$ and $E_2$, which correspond to
operators $Q_1$ and $Q_2$, respectively. The two possible responses of the
subject are identified with responses $A_1$ and $A_2$. Furthermore, we take
the limit points of the operators such that if $E_j$ occurs on all trials,
response $A_j$ will occur with a probability which tends to unity. The
operators are then given by

(13.1)    $Q_1 p = \alpha p + (1 - \alpha)$,
          $Q_2 p = \alpha p$.

Event $E_1$ occurs with probability $\pi_1$ and event $E_2$ with probability
$\pi_2 = 1 - \pi_1$.

In Section 5.3 we discussed such a model. In that section, the asymptotic
mean was found to be

(13.2)    $V_{1,\infty} = \pi_1$.


This result implies that, after a large number of trials, the proportion of
$A_1$ responses is equal to the proportion of $E_1$ occurrences. In the next
section we examine data from a number of experiments to see whether
human subjects behave according to this prediction of the model.

Although the main interest is in the value of the asymptote, we now
describe a procedure for estimating the parameter $\alpha$. In Section 5.3, the
mean on trial $n$ was found to be

(13.3)    $V_{1,n} = \pi_1 - (\pi_1 - V_{1,0})\,\alpha^{n}$.

This equation suggests a simple method of estimating $\alpha$. Sum this
expression from trial 0 to trial $N - 1$ to get

(13.4)    $\sum_{n=0}^{N-1} V_{1,n} = N\pi_1 - (\pi_1 - V_{1,0})\,\frac{1 - \alpha^{N}}{1 - \alpha}$.

If we have a way of estimating the left side of this equation and can also
estimate $V_{1,0}$, we can then compute an estimate of $\alpha$. We next
demonstrate how the sum on the left side can be estimated readily from data.
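
As a small numerical check, and only as a sketch, equations 13.3 and 13.4 can be compared with a trial-by-trial application of the expected operator. The function names and the parameter values in the usage note are illustrative assumptions of ours, not part of the original analysis.

```python
def mean_curve(alpha, pi1, v0, n_trials):
    """Closed-form means V_{1,n} of equation 13.3 for n = 0, ..., n_trials - 1."""
    return [pi1 - (pi1 - v0) * alpha ** n for n in range(n_trials)]


def mean_curve_by_recursion(alpha, pi1, v0, n_trials):
    """The same means, obtained by applying the expected operator trial by trial."""
    means, v = [], v0
    for _ in range(n_trials):
        means.append(v)
        # E_1 (probability pi1) applies Q_1; E_2 (probability 1 - pi1) applies Q_2.
        v = pi1 * (alpha * v + (1 - alpha)) + (1 - pi1) * (alpha * v)
    return means


def summed_means(alpha, pi1, v0, n_trials):
    """Right-hand side of equation 13.4."""
    return n_trials * pi1 - (pi1 - v0) * (1 - alpha ** n_trials) / (1 - alpha)
```

For example, with the assumed values alpha = 0.95, pi1 = 0.75, and v0 = 0.5, the two mean curves coincide to rounding error, and `sum(mean_curve(...))` reproduces `summed_means(...)`.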

Consider a random variable $x_{i,n}$ which is unity if the $i$th subject makes
response $A_1$ on trial $n$ and is zero otherwise. For each subject we then
obtain a sequence of 1's and 0's. The expected value of $x_{i,n}$, if nothing is
known about the $i$th subject's responses previous to trial $n$, is

(13.5)    $E(x_{i,n}) = \Pr\{x_{i,n} = 1\} = V_{1,n}$.

Next, define a statistic $T_i$ as the total number of $A_1$ responses by the $i$th
subject during the first $N$ trials, that is,

(13.6)    $T_i = \sum_{n=0}^{N-1} x_{i,n}$.

The expected value is

(13.7)    $E(T_i) = \sum_{n=0}^{N-1} E(x_{i,n}) = \sum_{n=0}^{N-1} V_{1,n}$.

Hence we have, from combining equations 13.4 and 13.7,

(13.8)    $E(T_i) = N\pi_1 - (\pi_1 - V_{1,0})\,\frac{1 - \alpha^{N}}{1 - \alpha}$.

The expected value of $T_i$ can be estimated by the mean number $\bar{T}$ of $A_1$
responses of $K$ subjects during trials 0 to $N - 1$, that is,

(13.9)    $\bar{T} = \frac{1}{K} \sum_{i=1}^{K} T_i$.

This leads to the estimation equation

(13.10)   $\bar{T} = N\pi_1 - (\pi_1 - V_{1,0})\,\frac{1 - \hat{\alpha}^{N}}{1 - \hat{\alpha}}$.

We are often able to choose $N$ large enough to make $\hat{\alpha}^{N}$ negligible
compared to unity. When this is so, we can solve for $\hat{\alpha}$ in the foregoing
equation to get

(13.11)   $\hat{\alpha} = 1 - \frac{\pi_1 - V_{1,0}}{N\pi_1 - \bar{T}}$.

All that remains is to estimate $V_{1,0}$. This initial mean may be estimated
from data obtained during the first few trials, or it can be assumed that
$V_{1,0} = 0.5$, for this simply means that the group of subjects has no initial
preference or bias.
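
To illustrate how equation 13.11 behaves, the following sketch simulates subjects who follow the model of equations 13.1 and then recovers the parameter from the mean count of $A_1$ responses. This is an illustrative assumption-laden example: the parameter values, the number of simulated subjects, the fixed random seed, and the choice to give each subject an independent event sequence are ours and do not come from any experiment described here.

```python
import random


def simulate_total(alpha, pi1, p0, n_trials, rng):
    """Number of A_1 responses by one simulated subject on trials 0..n_trials-1."""
    p, total = p0, 0
    for _ in range(n_trials):
        if rng.random() < p:        # subject makes response A_1
            total += 1
        if rng.random() < pi1:      # event E_1 occurs: apply Q_1
            p = alpha * p + (1 - alpha)
        else:                       # event E_2 occurs: apply Q_2
            p = alpha * p
    return total


def estimate_alpha(t_bar, n_trials, pi1, v0):
    """Equation 13.11; assumes alpha**n_trials is negligible compared to unity."""
    return 1 - (pi1 - v0) / (n_trials * pi1 - t_bar)


# Hypothetical settings chosen only for the demonstration.
ALPHA, PI1, V0, N, K = 0.90, 0.75, 0.5, 100, 4000
rng = random.Random(0)
t_bar = sum(simulate_total(ALPHA, PI1, V0, N, rng) for _ in range(K)) / K
alpha_hat = estimate_alpha(t_bar, N, PI1, V0)
```

Because the denominator $N\pi_1 - \bar{T}$ is a small difference of large numbers, the estimate is sensitive to sampling error in $\bar{T}$; the sketch uses many simulated subjects for that reason.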

Both $\alpha$ and $V_{1,0}$ can be estimated from equation 13.10 by choosing two
values of $N$. Denote these trial numbers by $N_1$ and $N_2$, and let $\bar{T}_1$ and
$\bar{T}_2$ be the corresponding values of $\bar{T}$. From the two equations of the
form 13.10, we can eliminate $V_{1,0}$ to get

(13.12)   $\frac{N_1\pi_1 - \bar{T}_1}{N_2\pi_1 - \bar{T}_2} = \frac{1 - \hat{\alpha}^{N_1}}{1 - \hat{\alpha}^{N_2}}$.

This equation can be solved numerically for the estimate of $\alpha$, and then
$V_{1,0}$ can be estimated from

(13.13)   $\hat{V}_{1,0} = \pi_1 - (N_1\pi_1 - \bar{T}_1)\,\frac{1 - \hat{\alpha}}{1 - \hat{\alpha}^{N_1}}$.

We use these two equations in the next section.
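
The text leaves the numerical method for equation 13.12 open. One possible approach, sketched here under our own assumptions, is bisection: for $N_1 < N_2$ the right side of 13.12 falls steadily from 1 toward $N_1/N_2$ as $\hat{\alpha}$ goes from 0 to 1, so a single root can be bracketed. The function names are hypothetical, and the check values in the usage note are synthetic rather than experimental.

```python
def expected_total(n, alpha, pi1, v0):
    """Equation 13.8: expected number of A_1 responses in the first n trials."""
    return n * pi1 - (pi1 - v0) * (1 - alpha ** n) / (1 - alpha)


def solve_alpha(t1, t2, n1, n2, pi1, iters=200):
    """Solve equation 13.12 for alpha by bisection (requires n1 < n2)."""
    target = (n1 * pi1 - t1) / (n2 * pi1 - t2)
    lo, hi = 1e-9, 1 - 1e-9
    for _ in range(iters):
        mid = (lo + hi) / 2
        ratio = (1 - mid ** n1) / (1 - mid ** n2)
        if ratio > target:      # ratio decreases in alpha, so the root lies above mid
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def initial_mean(alpha, t1, n1, pi1):
    """Equation 13.13: estimate of V_{1,0} once alpha has been found."""
    return pi1 - (n1 * pi1 - t1) * (1 - alpha) / (1 - alpha ** n1)
```

As a usage example, feeding the solver totals generated by `expected_total` with assumed values alpha = 0.95, v0 = 0.4, pi1 = 0.7, N_1 = 30, and N_2 = 100 should return those same values of alpha and v0.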

The model just described is adequate for describing most experiments
that use the non-contingent procedure. However, Neimark introduced a
modification of the procedure in some of her groups [9], and she modified
the model accordingly. Her subjects were required to guess which of
two lights would come on during each trial, but on some trials neither
light came on. These "blank trials," as she called them, constitute a
third class of events which we label $E_3$. If we assume an identity operator
for such events, we have the set of operators given by

(13.14)   $Q_1 p = \alpha p + (1 - \alpha)$     $[\pi_1]$
          $Q_2 p = \alpha p$                    $[\pi_2]$
          $Q_3 p = p$                           $[\pi_3 = 1 - \pi_1 - \pi_2]$.

The probabilities of application of these three operators are shown in
square brackets beside each equation. We now have a case of three
experimenter-controlled events. From Section 4.4, equations 4.30, we
see that

(13.15)   $\bar{a} = \pi_1 (1 - \alpha)$,
          $\bar{\alpha} = 1 - (\pi_1 + \pi_2)(1 - \alpha)$,
          $\bar{\lambda} = \frac{\pi_1}{\pi_1 + \pi_2}$.

From equation 4.28 we see that

(13.16)   $V_{1,n} = V_{1,\infty} - (V_{1,\infty} - V_{1,0})\,\bar{\alpha}^{n}$,

where the asymptotic mean is $V_{1,\infty} = \bar{\lambda}$.


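Therefore the estimation procedures described above may be used to
estimate $\bar{\alpha}$, and then the corresponding estimate of $\alpha$ can be obtained
from the second of equations 13.15. Equations 13.12 and 13.13 can be used
by replacing $\alpha$ with $\bar{\alpha}$ and $\pi_1$ with $V_{1,\infty}$:

(13.17)   $\frac{N_1 V_{1,\infty} - \bar{T}_1}{N_2 V_{1,\infty} - \bar{T}_2} = \frac{1 - \hat{\bar{\alpha}}^{N_1}}{1 - \hat{\bar{\alpha}}^{N_2}}$,

          $\hat{V}_{1,0} = V_{1,\infty} - (N_1 V_{1,\infty} - \bar{T}_1)\,\frac{1 - \hat{\bar{\alpha}}}{1 - \hat{\bar{\alpha}}^{N_1}}$.
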
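
The bookkeeping of equations 13.15 is easy to get wrong by hand, so a short sketch may help. It computes the expected-operator constants for the blank-trial model and inverts the second of equations 13.15 to pass from an estimate of $\bar{\alpha}$ back to $\alpha$. The function names are our own, and the value alpha = 0.92 in the usage note is an arbitrary illustration.

```python
def expected_operator(alpha, pi1, pi2):
    """Return (a_bar, alpha_bar, lambda_bar) of equations 13.15."""
    a_bar = pi1 * (1 - alpha)
    alpha_bar = 1 - (pi1 + pi2) * (1 - alpha)
    lambda_bar = a_bar / (1 - alpha_bar)    # simplifies to pi1 / (pi1 + pi2)
    return a_bar, alpha_bar, lambda_bar


def alpha_from_alpha_bar(alpha_bar, pi1, pi2):
    """Invert the second of equations 13.15 to recover alpha."""
    return 1 - (1 - alpha_bar) / (pi1 + pi2)
```

For the 66 : 17 schedule, `expected_operator(0.92, 0.66, 0.17)` gives an asymptote $\bar{\lambda}$ of about 0.795 whatever value of alpha is supplied, since the asymptote depends only on the two event probabilities.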
13.5 DATA FROM EXPERIMENTS USING THE 
NON-CONTINGENT PROCEDURE 
A number of experiments involving the non-contingent procedure are
briefly described in Section 13.3. Data from some of these experiments
are now examined, and the results are interpreted in terms of the model
described in the preceding section.

In Humphreys' experiment, the schedules used were 100 : 0, 50 : 50,
and 0 : 100. The data presented in his paper [3] indicate that the
proportions of guesses of "lights on" approached asymptotes of 1.0, 0.5, and
0, respectively. These empirical asymptotes agree with those predicted
by equation 13.2. Similar agreement is found in the data of Grant, Hake,
and Hornseth [4], who used schedules of 0 : 100, 25 : 75, 50 : 50, 75 : 25,
and 100 : 0. Each group of subjects approaches a performance asymptote
nearly equal to the probability that the light will come on.

Two of the groups in the Hake and Hyman study [7] were presented with 
independent sequences of symbols with reward probabilities of 50 : 50 
and 75 : 25. These groups tended towards asymptotes of 0.5 and 0.75, 
respectively, in agreement with equation 13.2. The other two groups 
were presented with Markov sequences of symbols; these groups also
tended towards asymptotes equal to the proportions of symbols of one 
kind in the sequence. This result agrees with the model predictions in 
Section 5.11 which described the effects of a Markov sequence of operators 
with the equal-alpha restriction imposed. 


TABLE 13.1

Data obtained by Jarvik [6] in an experiment in which subjects
predicted whether the experimenter would say "check" or
"plus" on each trial. Three groups of subjects were run with
the values of $\pi_1$, the proportions of "check" announcements
by the experimenter, shown. The proportions of "check"
predictions are shown for each trial block. Also the estimates
of $\alpha$ obtained from equation 13.11 are given. (These data
are reproduced from the Journal of Experimental Psychology
with the permission of the American Psychological
Association.)

                                   Proportions
Trial Block           $\pi_1 = 0.60$   $\pi_1 = 0.67$   $\pi_1 = 0.75$
0-10                      0.456            0.459            0.456
11-21                     0.458            0.518            0.564
22-32                     0.461            0.641            0.667
33-43                     0.536            0.675            0.804
44-54                     0.615            0.611            0.798
55-65                     0.622            0.626            0.703
66-76                     0.555            0.579            0.771
77-86                     0.573            0.674            0.759
Number of subjects        29               21               28
$\hat{\alpha}$            0.982            0.974            0.952


In the Jarvik experiment [6], subjects predicted whether the experimenter
would say "check" or "plus" on each trial. The data obtained are
presented in Table 13.1 and Fig. 13.3. As can be seen, the proportions of
"check" guesses are tending towards the asymptotes predicted by equation
13.2. Also, in Table 13.1 we show for each group the estimates of $\alpha$
obtained from equation 13.11. These estimates indicate that the speed of
learning increases with increasing proportions of "checks" in the
experimenter's sequence.
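
As a rough illustration of equation 13.11, the sketch below recomputes the estimates of $\alpha$ from the block proportions of Table 13.1. Several inputs are assumptions on our part rather than statements from the text: that the first seven blocks contain 11 trials and the last contains 10 (87 trials in all), that $V_{1,0} = 0.5$, and that the mean count of "check" predictions may be approximated by summing block proportions times block lengths. Under these assumptions the results land close to the tabled estimates.

```python
# Block proportions of "check" predictions, copied from Table 13.1.
PROPORTIONS = {
    0.60: [0.456, 0.458, 0.461, 0.536, 0.615, 0.622, 0.555, 0.573],
    0.67: [0.459, 0.518, 0.641, 0.675, 0.611, 0.626, 0.579, 0.674],
    0.75: [0.456, 0.564, 0.667, 0.804, 0.798, 0.703, 0.771, 0.759],
}
BLOCK_LENGTHS = [11] * 7 + [10]     # assumed trials per block (87 in total)
V0 = 0.5                            # assumed initial mean (no initial bias)


def mean_total(block_props, block_lengths):
    """Approximate T-bar: expected count of 'check' predictions per subject."""
    return sum(p * m for p, m in zip(block_props, block_lengths))


def estimate_alpha(t_bar, n_trials, pi1, v0):
    """Equation 13.11; assumes alpha**n_trials is negligible."""
    return 1 - (pi1 - v0) / (n_trials * pi1 - t_bar)


N = sum(BLOCK_LENGTHS)
ESTIMATES = {
    pi1: estimate_alpha(mean_total(props, BLOCK_LENGTHS), N, pi1, V0)
    for pi1, props in PROPORTIONS.items()
}
```

The estimates decrease as $\pi_1$ increases, which is the pattern the text describes as faster learning with a larger proportion of "checks."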


Fig. 13.3. The prediction data obtained by Jarvik [6]. The theoretical
asymptotes ($V_{1,\infty}$ = 0.60, 0.67, and 0.75) are shown by the dotted lines.
(Axes: proportions versus trials.)

Four of the sixteen groups of subjects studied by Neimark [9] received
a non-contingent procedure with two keys. The reward probabilities
were 100 : 0, 66 : 0, 66 : 34, and 66 : 17. In Table 13.2 we show the
results obtained by Neimark from three of these groups along with the
estimates of $\alpha$ and $V_{1,0}$ for each, computed from equations 13.17. Note
that the estimates of $\alpha$ increase as the difficulty of the discrimination
increases, as we would expect. In Fig. 13.4 we show the data for blocks
of five trials from the 66 : 0, 66 : 34, and 66 : 17 groups. The smooth
curves were plotted from equations 13.15 and 13.16, using the estimates of
$\alpha$ and $V_{1,0}$ given in Table 13.2.

TABLE 13.2

Results obtained by Neimark [9] from three non-contingent groups in her two-key
experiment. The event probabilities, $\pi_1$ and $\pi_2$, the theoretical asymptotic means,
$V_{1,\infty}$, the observed mean number, $\bar{T}_1$, of $A_1$ responses during the first 30 trials,
the observed mean number, $\bar{T}_2$, of $A_1$ responses during all 100 trials, and the
computed estimates of $\alpha$ and $V_{1,0}$ are shown. There were 20 subjects in each
group.

Group     $\pi_1$   $\pi_2$   $V_{1,\infty}$   $\bar{T}_1$    $\bar{T}_2$    $\hat{\alpha}$   $\hat{V}_{1,0}$
66 : 0    0.66      0         1.00             21.00          88.7           0.921            0.415
66 : 17   0.66      0.17      0.795            [illegible]    [illegible]    [illegible]      [illegible]
66 : 34   0.66      0.34      0.66             15.05          56.50          [illegible]      0.457

Fig. 13.4. Data obtained by Neimark [9] with the non-contingent procedure, and
theoretical means, $V_{1,n}$, computed from equations 13.15 and 13.16 with the parameter
values shown in Table 13.2. The heavy horizontal lines are the theoretical asymptotes.
(Legend: 66 : 0, 66 : 17, and 66 : 34 groups. Horizontal axis: trials, n, from 0 to 100.)

All experiments mentioned in this section used the non-contingent
procedure, and in all cases the empirical estimates of the asymptotic means
are in close agreement with the prediction of the model described in the
preceding section. Furthermore, we are aware of no similar experiments
that yield different results. However, we presented no measures of
goodness-of-fit of the model to the trial-by-trial data. We merely
estimated the parameter $\alpha$ and then compared the computed mean curves with
the data.




13.6 A MODEL WITH EXPERIMENTER-SUBJECT- 
CONTROLLED EVENTS 


In Section 13.3 it was pointed out that the T-maze experiments with
rats, as well as several human experiments, used a contingent procedure;
the outcomes of trials were contingent upon the subjects' responses.
A model for these experiments uses experimenter-subject-controlled
events discussed in Sections 3.13, 4.6, and 5.8. The two alternatives $A_1$
and $A_2$ are identified with the two available choices, for example, turning
right or left or pushing the right or left button. The two outcomes $O_1$
and $O_2$ correspond to reward and non-reward or correct and incorrect,
respectively. (For simplicity, we speak of right and left turns and rewards
and non-rewards; appropriate modifications for other experiments are
required.) An event then is an alternative chosen and an outcome; we
have four possible events as tabulated.

Event       Alternative   Outcome   Identification
$E_{11}$    $A_1$         $O_1$     right turn + reward
$E_{12}$    $A_1$         $O_2$     right turn + no reward
$E_{21}$    $A_2$         $O_1$     left turn + reward
$E_{22}$    $A_2$         $O_2$     left turn + no reward

Corresponding to each event $E_{jk}$ there is an operator $Q_{jk}$ defined by

(13.18)   $Q_{jk}\, p = \alpha_{jk}\, p + (1 - \alpha_{jk})\,\lambda_{jk}$.

We now place some restrictions on these four operators.
The first restrictions result from the symmetry of the situation. We
assume that reward of a right turn has the same effect on $p$ as reward of a
left turn has on $q = 1 - p$. So we require that

(13.19)   $\alpha_{11} = \alpha_{21} = \alpha_1$,
          $\lambda_{11} = 1 - \lambda_{21} = \lambda_1$.

For the same reasons we wish non-reward to have a symmetric effect and
so we take

(13.20)   $\alpha_{12} = \alpha_{22} = \alpha_2$,
          $1 - \lambda_{22} = \lambda_{12} = \lambda_2$.

These four restrictions reduce our four operators to

(13.21)   $Q_{11}\, p = \alpha_1 p + (1 - \alpha_1)\lambda_1$,
          $Q_{12}\, p = \alpha_2 p + (1 - \alpha_2)\lambda_2$,
          $Q_{21}\, p = \alpha_1 p + (1 - \alpha_1)(1 - \lambda_1)$,
          $Q_{22}\, p = \alpha_2 p + (1 - \alpha_2)(1 - \lambda_2)$.



In an experiment in which reward always follows $A_1$ and never follows
$A_2$, the subject chooses $A_1$ with almost certainty after a number of trials.
This means that operator $Q_{11}$ is applied repeatedly after learning has
occurred, and we want this operator to maintain $p$ at or very near unity.
Hence, we choose $\lambda_1 = 1$. Similarly, if the non-reward operator, $Q_{12}$,
were applied repeatedly, the probability $p$ should tend towards zero.
This implies that $\lambda_2 = 0$. These restrictions reduce our four operators to

(13.22)   $Q_{11}\, p = \alpha_1 p + (1 - \alpha_1)$     $[p\,\pi_1]$
          $Q_{12}\, p = \alpha_2 p$                      $[p\,(1 - \pi_1)]$
          $Q_{21}\, p = \alpha_1 p$                      $[(1 - p)\,\pi_2]$
          $Q_{22}\, p = \alpha_2 p + (1 - \alpha_2)$     $[(1 - p)(1 - \pi_2)]$.

Only two parameters, $\alpha_1$ and $\alpha_2$, remain. We call $\alpha_1$ the reward parameter
and $\alpha_2$ the non-reward parameter.

These four operators characterize a reinforcement-extinction model
because reward improves and non-reward reduces the probability of
going to the chosen side. But if non-reward occurs in the presence of
stimuli associated with reward, secondary reinforcement can occur. If the
right-hand sides of the second and fourth listed operators are interchanged,
a secondary-reinforcement model results. This model has two absorbing
barriers; asymptotically nearly all sequences terminate at zero or unity.
Though we do not make essential use of either model here, we regard both
as important.

As in Section 3.13, the conditional probabilities of the outcomes, given
a particular response, are constant. The probability of outcome $O_k$, given
alternative $A_j$, is denoted by $\pi_{jk}$. We drop the double subscript for
simplicity and let

(13.23)   $\pi_1 = \pi_{11} = 1 - \pi_{12}$,
          $\pi_2 = \pi_{21} = 1 - \pi_{22}$.

The event probabilities are as shown in brackets to the right of equation
13.22.

We next consider two alternative additional restrictions on the model.
First we consider the equal alpha condition ($\alpha_1 = \alpha_2$) and then we discuss
the condition $\alpha_2 = 1$ already considered in Section 8.6.

The instructions given a human subject in a two-choice experiment
may lead him to believe that non-reward of one response necessarily
implies that the other response would have been rewarded if it had been
made. Such an assumption by the subject may result from direct
instruction or implication, or may be made without the experimenter
suggesting it to him. The schedule used in the experiment may be such
that the assumption is correct, or it may not be, but the subject often has
no opportunity for testing the assumption in an unambiguous way.
When this assumption is made by a subject, it seems plausible to assume
that a rewarded $A_1$ response has nearly the same effect upon future
behavior as a non-rewarded $A_2$ response and vice versa. This in turn
implies the equal alpha condition,

(13.24)   $\alpha_1 = \alpha_2 = \alpha$.

The operators of equations 13.22 then become

(13.25)   $Q_{11}\, p = \alpha p + (1 - \alpha)$,
          $Q_{12}\, p = \alpha p$,
          $Q_{21}\, p = \alpha p$,
          $Q_{22}\, p = \alpha p + (1 - \alpha)$.


In Section 5.9 it was shown that these operators lead to the following
explicit formula for the means:

(13.26)   $V_{1,n} = V_{1,\infty} - (V_{1,\infty} - V_{1,0})\,[1 - (2 - \pi_1 - \pi_2)(1 - \alpha)]^{n}$,

where

(13.27)   $V_{1,\infty} = \frac{1 - \pi_2}{2 - \pi_1 - \pi_2}$.

Note that $\pi_1 + \pi_2$ need not be unity. This equation for the asymptotic
mean provides us with a direct way of telling whether the condition $\alpha_1 = \alpha_2$
is appropriate for any given set of data. In the following two sections
we present several sets of data which clearly do not satisfy this condition.
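
Equations 13.26 and 13.27 can be checked against a direct recursion on the mean, as a sketch. With equal alphas the expected change in $p$ is linear in $p$, so the recursion on the mean is exact; the event probabilities used are those bracketed in equations 13.22. Function names and the parameter values in the usage note are our own illustrative choices.

```python
def asymptote_equal_alpha(pi1, pi2):
    """Equation 13.27."""
    return (1 - pi2) / (2 - pi1 - pi2)


def mean_closed_form(alpha, pi1, pi2, v0, n):
    """Equation 13.26."""
    v_inf = asymptote_equal_alpha(pi1, pi2)
    rate = 1 - (2 - pi1 - pi2) * (1 - alpha)
    return v_inf - (v_inf - v0) * rate ** n


def mean_by_recursion(alpha, pi1, pi2, v0, n):
    """Apply the expected operator of equations 13.25 n times to the mean."""
    v = v0
    for _ in range(n):
        # Q_11 and Q_22 add (1 - alpha); they occur with total
        # probability v*pi1 + (1 - v)*(1 - pi2).
        v = alpha * v + (1 - alpha) * (v * pi1 + (1 - v) * (1 - pi2))
    return v
```

For the 50 : 0 and 75 : 25 schedules the asymptotes come out as 2/3 and 0.75, the values against which Stanley's data are compared in Section 13.7.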

When we regard the outcomes as reward and non-reward rather than as 
information about the pay-off schedules, there is no intuitive reason for 
assuming that the reward and non-reward parameters $\alpha_1$ and $\alpha_2$ are equal.
Certainly, when we think of rat experiments, we have little basis for 
discussing assumptions made by the subject. We do not care to think of the 
rat as collecting data and making decisions accordingly or as reasoning 
that the other choice would have paid off when he failed. Instead, we 
prefer to think of the rat as undergoing a simple conditioning process 
with reward leading to an increase in response strength. 

Estimating both $\alpha_1$ and $\alpha_2$ from a set of data leads to difficult
computational problems, and we have not found a wholly satisfactory
procedure for obtaining such estimates. However, the data we present in
the following two sections give some support to the assumption that
$\alpha_2$ is very near unity. A closed expression for the asymptotic mean in terms
of $\alpha_1$ and $\alpha_2$ is not available, but the expected operator approximation is
suggestive (cf. Section 6.4). From equation 6.24, we can show that the
expected-operator asymptotic mean is given by the formula

(13.28)   $V_{1,\infty} = \frac{(\pi_1 - \pi_2) - 2(1 - \pi_2)x + \sqrt{(\pi_1 - \pi_2)^2 + 4(1 - \pi_1)(1 - \pi_2)x^2}}{2(\pi_1 - \pi_2)(1 - x)}$,

where

(13.29)   $x = \frac{1 - \alpha_2}{1 - \alpha_1}$.

For the 50 : 0 case ($\pi_1 = 0.5$, $\pi_2 = 0$) this reduces to

(13.30)   $V_{1,\infty} = \frac{1 - 4x + \sqrt{1 + 8x^2}}{2(1 - x)}$.

When $\alpha_2 = 1$, we have $x = 0$ and $V_{1,\infty} = 1$. But when $x = 0.5$, the last
formula gives $V_{1,\infty} \approx 0.732$. Moreover, we already know that for
$\alpha_1 = \alpha_2$, that is, when $x = 1$, $V_{1,\infty} = 0.667$ for the 50 : 0 case.
(Even if $\alpha_2 = 1$, $V_{1,\infty}$ is not generally exactly unity unless $\pi_2 = 0$, because
the model guarantees that some p-value sequences are "absorbed" at
$p = 0$. This implies that a small proportion of animals will stabilize on
the unfavorable side of the maze unless $\pi_2 = 0$.)

Some of the 50 : 0 data described in the next two sections show that
$V_{1,\infty}$ can be at least as high as 0.9. To obtain such a large value of $V_{1,\infty}$
would require a value of $x$ in the foregoing formula as small as 0.1. As a
result we assume that $x = 0$, that is, that $\alpha_2 = 1$.
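
The numerical values quoted for the expected-operator asymptote can be reproduced with a few lines, offered here as a sketch. The function below evaluates equation 13.28 directly; its name is hypothetical, and it is undefined at $x = 1$ and at $\pi_1 = \pi_2$, where the formula takes the form 0/0, so the equal-alpha case is approached with $x$ slightly below 1.

```python
import math


def expected_operator_asymptote(pi1, pi2, x):
    """Equation 13.28, with x = (1 - alpha_2) / (1 - alpha_1) as in 13.29.

    Not defined for x == 1 or pi1 == pi2 (the expression becomes 0/0).
    """
    d = pi1 - pi2
    root = math.sqrt(d * d + 4 * (1 - pi1) * (1 - pi2) * x * x)
    return (d - 2 * (1 - pi2) * x + root) / (2 * d * (1 - x))
```

For the 50 : 0 case this gives 1 at x = 0, about 0.91 at x = 0.1, about 0.732 at x = 0.5, and a value near 0.667 as x approaches 1, in line with the figures in the text.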

If $\alpha_2 = 1$ is used, equations 13.22 simplify to

(13.31)   $Q_{11}\, p = \alpha_1 p + (1 - \alpha_1)$,
          $Q_{12}\, p = p$,
          $Q_{21}\, p = \alpha_1 p$,
          $Q_{22}\, p = p$.

The problem then is to estimate the remaining parameter $\alpha_1$ from data.
And having estimated $\alpha_1$, we wish to compute the means $V_{1,n}$ to compare
with the trial-by-trial means from the data.

In Section 8.6 we derived the approximate formula

(13.32)   $V_{1,n} \approx \frac{V_{1,0}}{V_{1,0} + (1 - V_{1,0})\, e^{-(\pi_1 - \pi_2)(1 - \alpha_1) n}}$.

From this equation we can compute the mean $V_{1,n}$ for any trial in terms
of the initial mean, the known values of $\pi_1$ and $\pi_2$, and the parameter $\alpha_1$.
We use this equation in the following two sections.


The one remaining task before proceeding to an analysis of data is to
develop a method for estimating the reward parameter $\alpha_1$. We use a
technique employed previously in other problems. The total number of
errors (choices of alternative $A_2$) on trials 0, 1, ..., $K - 1$ can be
estimated by the integral

(13.33)   $\int_0^K (1 - V_{1,n})\, dn = \int_0^K \frac{(1 - V_{1,0})\, e^{-(\pi_1 - \pi_2)(1 - \alpha_1) n}}{V_{1,0} + (1 - V_{1,0})\, e^{-(\pi_1 - \pi_2)(1 - \alpha_1) n}}\, dn$.

The result of the integration is

(13.34)   $\int_0^K (1 - V_{1,n})\, dn = \frac{-\log\,[V_{1,0} + (1 - V_{1,0})\, e^{-(\pi_1 - \pi_2)(1 - \alpha_1) K}]}{(\pi_1 - \pi_2)(1 - \alpha_1)}$.

For large numbers of trials $K$ we have, approximately,

(13.35)   $\int_0^K (1 - V_{1,n})\, dn \approx \frac{-\log V_{1,0}}{(\pi_1 - \pi_2)(1 - \alpha_1)}$.

If we denote the total number of errors made by a subject (or the mean for
a group of subjects) by $T_2$, we have the estimation equation

(13.36)   $T_2 = \frac{-\log V_{1,0}}{(\pi_1 - \pi_2)(1 - \hat{\alpha}_1)}$

or

(13.37)   $\hat{\alpha}_1 = 1 - \frac{-\log V_{1,0}}{(\pi_1 - \pi_2)\, T_2}$.

When $K$ is not large enough to permit us to use the large-trial
approximation, we substitute $T_2$ for the integral on the left side of equation 13.34
and solve numerically for the estimate of $\alpha_1$.
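
The integration leading to equation 13.34 can be verified numerically, as a sketch. The code below compares the closed form with a trapezoidal sum over the approximate mean curve of equation 13.32, taking the logarithm as natural (which is what the integration yields). The function names, the step count, and the parameter values in the usage note are our own assumptions.

```python
import math


def mean_approx(n, alpha1, pi1, pi2, v0):
    """Approximate mean V_{1,n} of equation 13.32."""
    c = (pi1 - pi2) * (1 - alpha1)
    return v0 / (v0 + (1 - v0) * math.exp(-c * n))


def expected_errors(k, alpha1, pi1, pi2, v0):
    """Closed-form integral of equation 13.34 (natural logarithm)."""
    c = (pi1 - pi2) * (1 - alpha1)
    return -math.log(v0 + (1 - v0) * math.exp(-c * k)) / c


def expected_errors_numeric(k, alpha1, pi1, pi2, v0, steps=20000):
    """Trapezoidal approximation to the integral of 1 - V_{1,n} over [0, k]."""
    h = k / steps
    total = 0.0
    for j in range(steps + 1):
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * (1 - mean_approx(j * h, alpha1, pi1, pi2, v0))
    return total * h


def alpha1_large_k(total_errors, pi1, pi2, v0):
    """Large-K estimate of alpha_1 from equation 13.37."""
    return 1 + math.log(v0) / ((pi1 - pi2) * total_errors)
```

With assumed values alpha1 = 0.96, pi1 = 0.5, pi2 = 0, and v0 = 0.5, the two error totals agree closely for K = 200, and feeding the large-K total back into `alpha1_large_k` returns 0.96.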

For most of the experiments to be analyzed, it is satisfactory to take
$V_{1,0} = 1/2$. The symmetry of the experimental responses makes this
reasonable when groups of subjects are considered, at least. Nevertheless,
we may have occasion to examine the data to see if 0.5 is a good estimate
of $V_{1,0}$. This is most easily done by plotting the data on semi-log paper;
the way to make the plot may be seen by using equation 13.32 above.
Let $V_{1,n}$ be estimated by $\hat{p}_n$, the proportion of subjects that make response
$A_1$ on trial $n$, and obtain from equation 13.32

(13.38)   $\frac{1 - \hat{p}_n}{\hat{p}_n} = \frac{1 - V_{1,0}}{V_{1,0}}\, e^{-(\pi_1 - \pi_2)(1 - \alpha_1) n}$.

Taking logarithms, we obtain

(13.39)   $\log \frac{1 - \hat{p}_n}{\hat{p}_n} = \log \frac{1 - V_{1,0}}{V_{1,0}} - (\pi_1 - \pi_2)(1 - \alpha_1)\, n$.

Thus, if we plot $(1 - \hat{p}_n)/\hat{p}_n$, obtained directly from the data, on the log
scale, versus $n$ on the linear scale of semi-log paper, a straight line with
intercept $(1 - V_{1,0})/V_{1,0}$ at $n = 0$ should be obtained. We suggest that
only the data from the early trials be plotted and that a straight line be
drawn by eye through the plotted points to obtain the intercept. The
initial mean $V_{1,0}$ can be obtained directly from the value of the intercept.
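
The text recommends drawing the line by eye on semi-log paper. As a substitute technique, and only as a sketch, an ordinary least-squares line through the points of equation 13.39 gives the same intercept when the data follow the model exactly. The function name is hypothetical, natural logarithms are assumed, and proportions equal to 0 or 1 are skipped because their log-odds are undefined.

```python
import math


def fit_initial_mean(proportions):
    """Least-squares line through log((1 - p_n)/p_n) versus n (equation 13.39).

    Returns (v0, slope), where the slope estimates -(pi1 - pi2)(1 - alpha_1).
    """
    pts = [(n, math.log((1 - p) / p))
           for n, p in enumerate(proportions) if 0 < p < 1]
    m = len(pts)
    mean_n = sum(n for n, _ in pts) / m
    mean_y = sum(y for _, y in pts) / m
    sxx = sum((n - mean_n) ** 2 for n, _ in pts)
    sxy = sum((n - mean_n) * (y - mean_y) for n, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_n     # equals log((1 - V0)/V0)
    v0 = 1 / (1 + math.exp(intercept))
    return v0, slope
```

As a usage example, proportions generated from equation 13.32 with an assumed $V_{1,0} = 0.48$ and an assumed exponent rate of 0.02 per trial are recovered exactly, apart from rounding, by `fit_initial_mean`.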


13.7 THE STANLEY T-MAZE DATA


The T-maze experiments with rats, conducted by Brunswik [1] and
Stanley [2], were described in Section 13.2. In this section we analyze
some of the data obtained by Stanley. Seven rats were used in each group,
and we are concerned with the 100 : 0, 50 : 0, and 75 : 25 groups. In
Table 13.3 we give the observed proportion of turns to the favorable side
for each of these groups during each day of the experiment. Eight trials
were run on each day.

Stanley controlled the number of rewards for each rat instead of the
total number of trials, and so the number of rats remaining in the
experiment decreases as shown in Table 13.3. For the 100 : 0 group, Stanley
ran each rat until the animal went two days, not necessarily consecutive,
without errors. A rat in each of the other groups was matched by the
litter-mate technique to a rat in the 100 : 0 group. Stanley ran a 50 : 0
rat, for example, until it received the same number of rewards as its
litter-mate in the 100 : 0 group. Hence the unequal numbers of trials for
the various animals. We describe this procedure in some detail because
we wish to use the data beyond the point where the first rat drops out of
the experiment in each group. We would expect that proportions of
favorable turns given in Table 13.3 would have been larger on the later
days of the experiment if Stanley had kept all rats in the experiment to the
end. Certainly in the 100 : 0 group the animals that learned fastest
dropped out of the experiment first, and so the proportions beyond day
5 shown in Table 13.3 for that group are clearly biased in the downward
direction. Whether or not a similar bias exists in the 50 : 0 and 75 : 25
groups depends on whether differences between litters are large or not.
If the matching technique used were in fact effective and if some litters
were "smarter" than others, surely a bias would result. On the other
hand, if the matching technique had little effect or if the observed
differences were the result of statistical "accidents" such as those observed in
our "stat-rats" (cf. Section 6.2), then there would be no reason to expect
an appreciable bias. We do not try to settle this question but merely
point out the arguments on both sides. The important point is that the
direction of bias, if one exists, is such that the proportions given in Table
13.3 are too small after some rats have dropped out ($N < 7$).




(A lower bound of 0.955 for the learning parameter in the 100 : 0 group 
was obtained by recomputing as if rats reaching the criterion continued 
with perfect performance instead of dropping out.) 


TABLE 13.3

Summary of data obtained by Stanley [2] on three groups of seven
rats each. There were eight trials each day. The proportions
of turns to the favorable side by each group on each day are
shown. (As discussed in the text, some rats were run through
more trials than others.) The estimates of $\alpha_1$ were obtained
from equations 13.34 and 13.37.

         100 : 0 Group      50 : 0 Group       75 : 25 Group
Day      N   Proportion     N   Proportion     N   Proportion
 1       7     0.571        7     0.518        7     0.500
 2       7     0.696        7     0.553        7     0.554
 3       7     0.714        7     0.661        7     0.589
 4       7     0.821        7     0.696        7     0.589
 5       7     0.804        7     0.643        7     0.643
 6       6     0.833        7     0.661        7     0.714
 7       6     0.937        7     0.714        7     0.750
 8       4     0.844        7     0.821        6     0.792
 9       4     0.875        7     0.786        6     0.792
10       4     0.875        7     0.893        4     0.844
11       3     0.875        6     0.833        4     0.875
12       3     0.917        6     0.875        4     0.813
13                          6     0.854        4     0.782
14                          6     0.854        3     0.834
15                          5     0.850        3     0.917
16                          4     0.875
17                          4     0.906
18                          4     0.875
19                          4     0.937
20                          4     0.937
21                          3     0.917
22                          3     0.917
23                          3     1.000
24                          3     0.958

$\hat{\alpha}_1$       0.961              0.962              0.964


The equal-alpha condition described in Sect 
by Stanley's 50 : 0 and 75 : 25 groups. 


ion 13.6 is clearly not met 


The equal-alpha asymptotes are 


——— a 


SEC. 13.7 THE STANLEY T-MAZE DATA 293 


0.667 and 0.750, respectively, for those two groups. On days beyond the
sixth, the 50 : 0 proportions exceed 0.667 on all days. Similarly, on all
days beyond the seventh, the 75 : 25 proportions exceed 0.750. Furthermore,
the data in Table 13.3 strongly suggest that all groups are approaching
an asymptote of nearly 1.00. Therefore, we apply the model
with $\alpha_2 = 1$.

The initial mean $V_{1,0}$ is taken to be 0.5 for Stanley's 100 : 0 and 50 : 0
groups. The data for the 75 : 25 group suggest that $V_{1,0}$ is slightly less
than 0.5 and so we plotted $(1 - p_n)/p_n$ versus $n$ on semi-log paper as
described at the end of the preceding section. This led to the estimate
$V_{1,0} = 0.48$ for the 75 : 25 group.

Fig. 13.5. The 50 : 0 data obtained by Stanley [2] from seven rats. The smooth
curve was computed from equation 13.32 with $\pi_1 = 0.5$, $\pi_2 = 0$,
$V_{1,0} = 0.5$, and $\alpha_1 = 0.96$. (Axes: proportions against trials, $n$.)

The reward parameter $\alpha_1$ was estimated for each group by using
equations 13.37 and 13.34 of the last section. An approximate value of
$\alpha_1$ was first obtained from equation 13.37, and then equation 13.34 was
solved numerically for each of the three groups. The quantity $T_2$ was
defined to be the mean total number of errors per subject, but caution
must be used in computing this quantity when the number of subjects is
not the same on all trials. The simplest procedure is to estimate the
integral on the left side of equations 13.34 and 13.35 by summing the
observed proportions of errors on each day and multiplying by 8, since
there were eight trials per day. In this way we obtained the estimates of
$\alpha_1$ from Stanley's data shown in Table 13.3. These three estimates are
remarkably close to one another, and within the framework of the model
this implies that the effect of a reward is independent of the reinforcement
schedule used. In the next section we see that this result does not hold
for data obtained from human subjects.
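
The summation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the daily proportions are copied from the 50 : 0 column of Table 13.3, and the function name `mean_total_errors` is ours, not the authors'. The result is the error total that enters the estimation equations; those equations themselves are not reproduced here.

```python
# Daily proportions of turns to the favorable side, 50 : 0 group (Table 13.3).
PROPORTIONS_50_0 = [
    0.518, 0.553, 0.661, 0.696, 0.643, 0.661, 0.714, 0.821,
    0.786, 0.893, 0.833, 0.875, 0.854, 0.854, 0.850, 0.875,
    0.906, 0.875, 0.937, 0.937, 0.917, 0.917, 1.000, 0.958,
]
TRIALS_PER_DAY = 8  # stated in the caption of Table 13.3


def mean_total_errors(proportions, trials_per_day=TRIALS_PER_DAY):
    """Sum the daily error proportions and scale by the trials per day.

    Using proportions rather than raw counts sidesteps the problem that
    the number of rats still running differs from day to day.
    """
    return trials_per_day * sum(1.0 - p for p in proportions)


print(round(mean_total_errors(PROPORTIONS_50_0), 2))
```

Working from daily proportions means each day is weighted equally, whatever the number of rats that remained in the experiment on that day.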

Using the estimated values of $\alpha_1$, we computed the approximate values
of the trial means for each group from equation 13.32. In Figs. 13.5 and
13.6 we show the results for the 50 : 0 and 75 : 25 groups along with the
experimental points.
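
For readers who want to see the model in action, the following Python sketch simulates the process directly instead of evaluating equation 13.32, which is not reproduced in this section. It assumes the operators take the form suggested by the text: a rewarded choice of the favorable side moves $p$ to $\alpha_1 p + (1 - \alpha_1)$, a rewarded choice of the other side moves $p$ to $\alpha_1 p$, and non-reward leaves $p$ unchanged ($\alpha_2 = 1$). The function name, the number of simulated subjects, and the random seed are illustrative choices of ours.

```python
import random


def simulate_group_means(pi1, pi2, alpha1, v0, n_trials, n_subjects=2000, seed=0):
    """Monte Carlo estimate of the mean probability of choosing the favorable side.

    Assumed operators (identity operator for non-reward, alpha2 = 1):
      favorable side chosen and rewarded:   p -> alpha1 * p + (1 - alpha1)
      unfavorable side chosen and rewarded: p -> alpha1 * p
      any non-rewarded choice:              p unchanged
    """
    rng = random.Random(seed)
    probs = [v0] * n_subjects
    means = []
    for _ in range(n_trials):
        # Record the group mean before the trial is run.
        means.append(sum(probs) / n_subjects)
        for i, p in enumerate(probs):
            if rng.random() < p:            # favorable side chosen
                if rng.random() < pi1:      # ... and rewarded
                    probs[i] = alpha1 * p + (1.0 - alpha1)
            elif rng.random() < pi2:        # unfavorable side chosen and rewarded
                probs[i] = alpha1 * p
    return means


if __name__ == "__main__":
    # Parameter values quoted for Stanley's 75 : 25 group (Fig. 13.6).
    curve = simulate_group_means(0.75, 0.25, 0.96, 0.48, n_trials=120)
    for n in range(0, 120, 20):
        print(n, round(curve[n], 3))
```

A simulation of this kind gives the exact trial means of the assumed process up to sampling error, so it can serve as a rough check on any approximate formula for the means.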


Fig. 13.6. The 75 : 25 data obtained by Stanley [2] from seven rats. The smooth
curve was computed from equation 13.32 with $\pi_1 = 0.75$, $\pi_2 = 0.25$,
$V_{1,0} = 0.48$, and $\alpha_1 = 0.96$. (Axes: proportions against trials, $n$.)


13.8 HUMAN EXPERIMENTS USING THE 
CONTINGENT PROCEDURE 


In this section we discuss several experiments using [...] choice situations.
[...] Stanley described in the preceding section. [...] of five Harvard
undergraduates. The reward probabilities were 100 : 0, 50 : 0, 100 : 50, and
75 : 25. For the 100 : 0 and 50 : 0 conditions, two different procedures were
used: "pay to play" and "play free." In the pay-to-play procedure, the
subject's net pay was computed by subtracting his losses from his wins. In
the play-free procedure, a subject's net pay was simply the number of chips
won. The chips were worth one cent. Each subject in the two 100 : 0 groups
was run for 75 trials whereas all other subjects were run for 150 trials of
acquisition. In addition, each subject received 50 trials of extinction, but
we delay discussion of those data until a later section.

Table 13.4 gives the proportions of choices of the more favorable side
made by all subjects in each group during blocks of 10 trials. As in the
last section, we see that the equal alpha condition is not satisfied in these
data; for the 50 : 0 groups the equal alpha asymptote is 0.67, and for the
75 : 25 group it is 0.75. The data in Table 13.4 for these groups clearly
exceed those asymptotes. (The 100 : 0 and 100 : 50 groups should have
an asymptote of 1.00 for either the equal alpha condition or the condition
$\alpha_2 = 1$ discussed in Section 13.6.)


TABLE 13.4

"Two-armed bandit" data obtained by Goodnow from six groups of five subjects
each. The reward probabilities and the two procedures, "pay to play" and
"play free," identify each group. For each block of ten trials the group
proportions of choices of the favorable side are given. The estimates of
$\alpha_1$ were obtained from equations 13.34 and 13.37.

Trial      100 : 0   100 : 0   50 : 0    50 : 0    100 : 50  75 : 25
Block      Pay to    Play      Pay to    Play      Pay to    Pay to
           Play      Free      Play      Free      Play      Play

0-9        0.4       0.38      0.44      0.44      0.54      0.42
10-19      0.84      0.66      [...]     0.50      0.54      0.68
20-29      1.00      0.90      0.60      0.70      0.64      0.54
30-39      1.00      0.88      0.70      0.68      0.66      0.78
40-49      1.00      0.90      0.70      0.58      0.84      0.64
50-59      1.00      0.88      0.82      0.74      0.96      0.68
60-69†     1.00      0.99      0.86      0.60      0.90      0.66
70-79                          0.90      0.68      1.00      0.80
80-89                          0.88      0.76      1.00      0.80
90-99                          0.80      0.70      1.00      0.88
100-109                        0.94      0.76      1.00      0.92
110-119                        0.78      0.68      1.00      0.90
120-129                        0.78      0.84      1.00      0.92
130-139                        0.82      0.76      1.00      0.82
140-149                        0.90      0.74      1.00      0.86

$\hat{\alpha}_1$   0.888     0.951     0.962     0.979     0.928     0.967

† For the 100 : 0 groups this includes trials 60-74.


For all six groups we assumed that $V_{1,0} = 0.5$ and $\alpha_2 = 1$, and we
estimated $\alpha_1$ from equations 13.34 and 13.37. We obtained the estimates
shown in the last row of Table 13.4. In Fig. 13.7 we show the data for
the 75 : 25 group along with a curve of $V_{1,n}$ computed from equation
13.32, with $V_{1,0} = 0.5$ and $\alpha_1 = 0.967$.

Two inferences may be drawn from the estimates $\hat{\alpha}_1$ for the six groups.
First, reward had a larger effect on the probability of choosing the more
favorable side ($\hat{\alpha}_1$ was smaller) for subjects who paid to play than for
subjects who played free. This is true for both the 100 : 0 and 50 : 0
conditions. Second, the effect of a reward was much greater in the
continuous reinforcement cases (100 : 0) than in the corresponding
partial reinforcement cases. For the pay-to-play procedure the order of
decreasing effect of reward was 100 : 0, 100 : 50, 50 : 0, 75 : 25. This is
the order of our intuitive guesses about the order of increasing
"ambiguity" of the situations, that is, the 100 : 0 seems to be the easiest
discrimination to learn and the 75 : 25 the most difficult. In a sense, the
50 : 0 and 100 : 50 conditions are symmetric in pay-off and non-pay-off,
but the subject is presumably searching for the more favorable side and so
experiences the consistent success side more often than the consistent
failure side.

Fig. 13.7. The 75 : 25 data obtained by Goodnow from five Harvard students, using
the two-armed bandit with the pay-to-play condition. The smooth curve was computed
from equation 13.32 with $\pi_1 = 0.75$, $\pi_2 = 0.25$, $V_{1,0} = 0.5$, and
$\alpha_1 = 0.967$. (Axes: proportions against trials, $n$.)


Another set of experiments using the two-armed bandit was conducted
by Laval Robillard as part of a senior honors thesis at Harvard in
1952-1953. He used seven groups of 10 Harvard freshmen as subjects. The
play-free procedure was used for all groups but no chip was used to activate
the machine. The first part of Robillard's set of experiments was a study
of the effect of the amount of reward in the 50 : 0 situation. Subjects in
one group were told to win as many chips as possible but were not told
that the chips won would be exchanged for money. Another group was
told that each chip was worth one cent, and a third group was told that
each chip was worth five cents. The chips were exchanged for money at
the end of the experiment. Each subject was run for 100 trials of
acquisition. Four additional groups were run without monetary value
placed on the chips and with reward probabilities of 30 : 0, 80 : 0, 80 : 40,
and 60 : 30. Extinction trials were run on all 70 subjects, but we delay
discussion of the extinction data until a later section.


TABLE 13.5

"Two-armed bandit" data obtained by Robillard from seven groups of ten
subjects each. The reward probabilities and monetary value of each chip are
shown for each group. For each block of ten trials the group proportion of
choices of the favorable side is given. The estimates of $\alpha_1$ were
obtained from equations 13.34 and 13.37.

Trial     50 : 0   50 : 0   50 : 0   30 : 0   80 : 0   80 : 40  60 : 30
Block     (0¢)     (1¢)     (5¢)     (0¢)     (0¢)     (0¢)     (0¢)

0-9       0.52     0.51     0.49     0.56     0.59     0.50     0.49
10-19     0.63     0.54     0.38     0.57     0.77     0.59     0.58
20-29     0.69     0.67     0.67     0.55     0.88     0.71     0.62
30-39     0.63     0.50     0.59     0.63     0.88     0.64     0.51
40-49     [...]    0.66     0.63     0.60     0.86     0.63     0.51
50-59     0.75     0.66     0.64     0.66     0.91     0.63     [...]
60-69     0.76     0.71     0.71     0.65     0.92     0.53     0.57
70-79     0.85     0.70     0.73     0.65     0.89     0.71     0.57
80-89     0.87     0.83     0.72     0.65     0.88     0.73     0.65
90-99     0.90     [...]    [...]    0.66     0.89     0.70     0.5

$\hat{\alpha}_1$  0.958    0.969    0.973    0.967    0.943    0.975    0.975


Table 13.5 summarizes the acquisition data obtained by Robillard. The
proportions of choices of the more favorable side for all subjects in each
group are given for blocks of 10 trials. We assumed $V_{1,0} = 0.5$ and
estimated $\alpha_1$ for each group from equation 13.34. The estimates obtained
are shown in the last row of Table 13.5. In Fig. 13.8 we give an example
of the data from Robillard's experiments along with the curve of $V_{1,n}$
versus $n$ computed from equation 13.32.

Again, two inferences may be drawn about how the effectiveness of a
success trial depends on certain experimental conditions. First, we see
that success has the greatest effect when the stakes are lowest. A
comparison of Robillard's three 50 : 0 groups shows that success is most
effective when the chips have no monetary value and is least effective when
each chip is worth five cents. The differences are small and probably not
significant, however. The second inference has to do with the effectiveness
[...] the order of decreasing [...] 30 : 0, 80 : 40, 60 : 30. (The last two
groups gave identical estimates of $\alpha_1$.)

Fig. 13.8. The 50 : 0 data obtained by Robillard from ten Harvard freshmen,
using the two-armed bandit. This group received no money in exchange for the
chips. The smooth curve was computed from equation 13.32 with $\pi_1 = 0.5$,
$\pi_2 = 0$, $V_{1,0} = 0.5$, and $\alpha_1 = 0.958$. (Axes: proportions against
trials, $n$.)


A similar experiment used the playing card technique described in
Section 13.3. Briefly, a subject was presented with two playing cards
face down on each trial and asked to choose the red card; if he succeeded
he was paid five cents. Twenty subjects were run by students in a course
in experimental social psychology at Harvard in October, 1952. The
50 : 0 condition was used, and each subject was run for 100 trials. One
group of ten subjects were Cambridge high-school students, and another
group of ten subjects were Harvard undergraduates majoring in
mathematics, physics, or astronomy. Table 13.6 summarizes the data obtained,
and Fig. 13.9 shows the data for the high-school group along with a
computed curve of $V_{1,n}$ versus $n$. As in the previous two sets of
experiments, the data exceed the equal-alpha asymptote of 0.67 during all
except the early blocks of trials.


TABLE 13.6

Data obtained from the playing-card experiment with ten subjects in each
group. The proportion of choices of the favorable side by each of the two
groups is shown for each block of ten trials. The estimates of $\alpha_1$
were obtained from equations 13.34 and 13.37.

Trial      High-School    Harvard
Block      Students       Students

0-9        0.47           0.59
10-19      0.63           0.72
20-29      0.64           0.77
30-39      0.71           0.75
40-49      0.74           0.76
50-59      0.71           0.82
60-69      0.77           0.83
70-79      0.76           0.87
80-89      0.89           0.94
90-99      0.91           0.96

$\hat{\alpha}_1$   0.958          0.934


The mean initial probability was taken to be 0.5 for both groups.
The number of correct choices made by each group during the first five
trials was 21 out of a possible 50, but since the standard deviation of the
expected number of successes in 50 binomial observations with $p = 0.5$
is 3.5, we saw no reason for not taking $V_{1,0} = 0.5$. Furthermore,
there is no evidence that the two groups of subjects had different mean
initial probabilities of choosing the favorable card. The success
parameters $\alpha_1$ were estimated by using equation 13.34, and the results
are given in Table 13.6. The data show that the college undergraduates
"learned faster" than the high-school students.
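
The binomial check in the preceding paragraph is quick to reproduce. The sketch below uses only the figures quoted in the text (21 correct choices out of 50, $p = 0.5$); the variable names are ours.

```python
import math

n_obs = 50        # 10 subjects x first 5 trials
p_chance = 0.5    # hypothesized initial probability
observed = 21     # correct choices reported for each group

expected = n_obs * p_chance
std_dev = math.sqrt(n_obs * p_chance * (1.0 - p_chance))
z = (observed - expected) / std_dev

print(f"expected = {expected}, sd = {std_dev:.2f}, deviation = {z:.2f} sd")
```

The observed count lies a little more than one standard deviation below the chance expectation of 25, which is consistent with the decision to keep the initial mean at 0.5.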

In Neimark's study [9], four groups of subjects were run with the
contingent procedure in a two-choice situation. The experimental
method was described in Section 13.3. With reward probabilities of
66 : 0 and 66 : 17, the group performance exceeded the equal-alpha
asymptotes, as in other investigations described in this section. However,
Neimark's 66 : 34 group seemed to be approaching the equal-alpha
asymptote of 0.66. This agrees with the results from Detambel's 50 : 0
group [8], which also seemed to approach the equal-alpha asymptote
of 0.67.


Except for the data from the Neimark 66 : 34 group and the Detambel
50 : 0 group just mentioned, all data from two-choice experiments using
the contingent procedure appear to be inconsistent with the equal-alpha
condition. The identity operator assumption for non-reward, on the
other hand, leads to a model which is consistent with the bulk of the data
presently available. However, we have not seriously tested the
identity-operator assumption with the data: rather, we have shown that several
sets of data indicate that $1 - \alpha_2$ is small compared to $1 - \alpha_1$. The two
notable exceptions cannot be ignored of course.

Fig. 13.9. The 50 : 0 data obtained from the playing-card experiment with ten
Cambridge high-school students. The smooth curve was computed from
equation 13.32 with $\pi_1 = 0.5$, $\pi_2 = 0$, $V_{1,0} = 0.5$, and
$\alpha_1 = 0.958$. (Axes: proportions against trials, $n$.)


*13.9 THREE-CHOICE EXPERIMENTS 


As noted in Section 13.3, Neimark ran several groups of subjects in a
three-choice situation. She used three telegraph keys and three lights
with both the non-contingent and contingent procedures.

For the non-contingent groups, the schedules of appearance and
non-appearance of lights were independent of the subjects' choices. The
reward probabilities were 100 : 0 : 0, 66 : 0 : 0, 66 : 17 : 17, and
66 : 8.5 : 8.5. A model for this part of Neimark's experiment can be
obtained from the case of experimenter-controlled events. In Sections
4.4 and 5.3 we discussed this case for more than two responses and more
than two events. The symmetry among the three events suggests the
equal alpha condition considered in Section 5.3. Furthermore, it seems
plausible to choose the limit vectors of the three event operators so that,
if event $E_j$ occurs on every trial, response $A_j$ will occur with a probability
which tends to unity. Hence we assume that

$$
\lambda_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \qquad
\lambda_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad
\lambda_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.
\tag{13.40}
$$


Equation 5.20 then gives, for the asymptotic vector of marginal means,

$$
V_{1,\infty} = \begin{bmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{bmatrix},
\tag{13.41}
$$

where $\pi_j$ is the probability of occurrence of event $E_j$. From this result
we conclude that the asymptotic proportion of $A_j$ responses is $\pi_j$, and this
result agrees with Neimark's data from her 100 : 0 : 0 and 66 : 17 : 17
groups.
In Neimark's 66 : 0 : 0 and 66 : 8.5 : 8.5 schedules, a number of
"blank" trials appeared, that is, trials on which no light appeared. Thus
we need to introduce a fourth event $E_4$ with probability $\pi_4$ which has the
values 0.34 and 0.17, respectively, for the two groups. As in Section 13.4,
we assume an identity operator for this event. This assumption, along
with those previously made in this section, allows us to obtain from equation
4.39 the result

$$
V_{1,\infty} = \frac{1}{\pi_1 + \pi_2 + \pi_3}
\begin{bmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{bmatrix}.
\tag{13.42}
$$

Hence the asymptotic proportion of $A_1$ responses for Neimark's 66 : 0 : 0
and 66 : 8.5 : 8.5 groups should be 1.00 and 0.795, respectively. These
conclusions are in close agreement with Neimark's data.
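
As a quick arithmetic check of equation 13.42, the asymptotic proportion of $A_1$ responses is $\pi_1$ divided by the total probability of the three light events, that is, by $1 - \pi_4$. The helper name below is ours, and the probabilities are the rounded decimal values quoted in the text.

```python
def asymptote_with_blanks(pi):
    """Asymptotic response proportions when blank trials leave p unchanged.

    `pi` holds the probabilities of the three light events; the remaining
    probability mass, 1 - sum(pi), belongs to blank trials.
    """
    total = sum(pi)
    return [p / total for p in pi]


for label, pi in [("66 : 0 : 0", (0.66, 0.0, 0.0)),
                  ("66 : 8.5 : 8.5", (0.66, 0.085, 0.085))]:
    print(label, round(asymptote_with_blanks(pi)[0], 3))
```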

To describe the data from Neimark's three-key contingent groups, we
need a model based upon experimenter-subject-controlled events. We
have three responses corresponding to the three keys, and two outcomes,
reward (light on) and non-reward (light not on). From the symmetry of
the situation we assume

$$
\begin{aligned}
\alpha_{11} &= \alpha_{21} = \alpha_{31} = \alpha_1, \\
\alpha_{12} &= \alpha_{22} = \alpha_{32} = \alpha_2.
\end{aligned}
\tag{13.43}
$$

As before, we assume that reward can lead to perfect learning and so we
take

$$
\lambda_{11} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \qquad
\lambda_{21} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad
\lambda_{31} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.
\tag{13.44}
$$

When we assume the identity operator condition for non-reward
($\alpha_2 = 1$), it seems likely that the asymptotic vector of marginal means is

$$
V_{1,\infty} \cong \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\tag{13.45}
$$

provided $\pi_1$ exceeds $\pi_2$ and $\pi_3$ by a fair amount. This conclusion seems
inconsistent with Neimark's data on her contingent 66 : 0 : 0, 66 : 34 : 34,
and 66 : 17 : 17 groups; those groups did not appear to approach an
asymptote of 1.00 for response $A_1$.


Except when we take $\alpha_2 = 1$, we need to make an assumption about the
non-reward limit vectors $\lambda_{j2}$. When we assume that if $A_j$ occurs and is
not rewarded, its probability will tend towards zero, a special assumption
needs to be made about how the probability divides up among the other
responses. In Section 5.10 we introduced one possible assumption by the
second of equations 5.108. For the model under discussion, that
assumption becomes

$$
\lambda_{12} = \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix}, \qquad
\lambda_{22} = \begin{bmatrix} 0.5 \\ 0 \\ 0.5 \end{bmatrix}, \qquad
\lambda_{32} = \begin{bmatrix} 0.5 \\ 0.5 \\ 0 \end{bmatrix}.
\tag{13.46}
$$

When these limit vectors are assumed and we further let $\alpha_1 = \alpha_2$, we can
compute the components of the asymptotic vector of marginal means from
equation 5.113 and obtain for that vector

$$
V_{1,\infty} = \frac{1}{(1 - \pi_2)(1 - \pi_3) + (1 - \pi_1)(1 - \pi_3) + (1 - \pi_1)(1 - \pi_2)}
\begin{bmatrix}
(1 - \pi_2)(1 - \pi_3) \\
(1 - \pi_1)(1 - \pi_3) \\
(1 - \pi_1)(1 - \pi_2)
\end{bmatrix}.
\tag{13.47}
$$

For the reward probabilities of 66 : 0 : 0, 66 : 34 : 34, and 66 : 17 : 17,
this equation gives asymptotes for $A_1$ of 0.60, 0.50, and 0.56, respectively.
These computed asymptotes seem clearly less than the corresponding
approximate asymptotes obtained from Neimark's data.
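
The three asymptotes can be reproduced from equation 13.47 with exact fractions. The sketch assumes that the nominal percentages 66, 34, and 17 stand for 2/3, 1/3, and 1/6, which is our reading rather than something stated in the text; the function name is also ours.

```python
from fractions import Fraction


def equal_alpha_asymptote(pi1, pi2, pi3):
    """Asymptotic proportion of A1 responses, first component of equation 13.47."""
    a1 = (1 - pi2) * (1 - pi3)
    a2 = (1 - pi1) * (1 - pi3)
    a3 = (1 - pi1) * (1 - pi2)
    return a1 / (a1 + a2 + a3)


# Assumed exact reward probabilities behind the nominal schedule labels.
schedules = {
    "66 : 0 : 0": (Fraction(2, 3), Fraction(0), Fraction(0)),
    "66 : 34 : 34": (Fraction(2, 3), Fraction(1, 3), Fraction(1, 3)),
    "66 : 17 : 17": (Fraction(2, 3), Fraction(1, 6), Fraction(1, 6)),
}
for label, pis in schedules.items():
    print(label, round(float(equal_alpha_asymptote(*pis)), 2))
```

Under this reading the exact values are 3/5, 1/2, and 5/9, which round to the three figures quoted in the text.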

Neither set of assumptions just considered for the three-key contingent
case is consistent with Neimark's data. The identity operator assumption
for non-reward leads to asymptotes for $A_1$ that look high, whereas
the equal alpha condition together with the special assumption about the
non-reward limit vectors (equation 13.46) leads to asymptotes for $A_1$ that
are too low.


13.10 EXTINCTION DATA 


In both sets of experiments on the two-armed bandit described in 
Section 13.8, the acquisition trials were followed by a number of extinction 
trials during which neither side led to reward. We now wish to look at 
these extinction data and make some inferences about the rate of extinction 
following the various reinforcement schedules. But before analyzing the 
data, we examine the model to be used. 

The most general operators we have considered applying in this chapter
are given by equations 13.22. For extinction, $\pi_1 = \pi_2 = 0$, and so we
have left only the two operators, $Q_{12}$ and $Q_{22}$, specified by

$$
\begin{aligned}
Q_{12}\, p &= \alpha_2 p, \\
Q_{22}\, p &= \alpha_2 p + (1 - \alpha_2).
\end{aligned}
\tag{13.48}
$$

The first equation is appropriate when the originally favorable side is
chosen and the second when the other choice is made. In Section 13.8
we took the non-reward parameter $\alpha_2$ to be unity and thereby succeeded in
fitting the acquisition data adequately. However, if we took $\alpha_2 = 1$ for
extinction, we would conclude that extinction would never occur! Both
operators $Q_{12}$ and $Q_{22}$ would be identity operators, and $p$ would never
change. The data cannot stand such a drastic assumption, and so we are
forced to assume that extinction and acquisition are just different. We
say more about this point in the next section, but first we estimate the
extinction parameters $\alpha_2$ from the data.
A procedure for estimating $\alpha_2$ can be obtained directly from the total
number of choices of the favorable side above chance. We need an
expression for the means $V_{1,n}$. The operators defined by equations 13.48
are those discussed in Section 5.7. Hence we use equations 5.31, 5.57,
and 5.58 to obtain the expression

$$
V_{1,n} = \tfrac{1}{2} + \left(V_{1,0} - \tfrac{1}{2}\right)(2\alpha_2 - 1)^n.
\tag{13.49}
$$

The asymptotic mean, $V_{1,\infty}$, is 1/2. The expected number of choices of
the previously favorable side, above chance, is $(V_{1,n} - 1/2)$ on trial $n$
for each subject. We then sum over all trials from 0 to $K - 1$ and get

$$
\sum_{n=0}^{K-1} \left(V_{1,n} - \tfrac{1}{2}\right)
= \left(V_{1,0} - \tfrac{1}{2}\right) \sum_{n=0}^{K-1} (2\alpha_2 - 1)^n
= \frac{V_{1,0} - \tfrac{1}{2}}{2(1 - \alpha_2)} \left[1 - (2\alpha_2 - 1)^K\right].
\tag{13.50}
$$

This quantity can be estimated by the mean number $D$ of extinction
responses above chance on trials 0 to $K - 1$, that is,

$$
D = \frac{V_{1,0} - \tfrac{1}{2}}{2(1 - \hat{\alpha}_2)} \left[1 - (2\hat{\alpha}_2 - 1)^K\right].
\tag{13.51}
$$

This equation must be solved numerically for the estimate of $\alpha_2$ except
when $K$ is large enough to allow us to neglect $(2\hat{\alpha}_2 - 1)^K$, in which case

$$
\hat{\alpha}_2 = 1 - \frac{V_{1,0} - \tfrac{1}{2}}{2D}.
\tag{13.52}
$$

For extinction after 100 percent reinforcement, the initial mean $V_{1,0}$
corresponds to the mean at the end of acquisition and so can be estimated
from the acquisition data. For extinction after partial reinforcement,
we do not know just when extinction begins from the point of view of the
subject, but we accept the experimenter's definition of the beginning of
extinction. In that case, $V_{1,0}$ for the extinction portion of the data can
be estimated from the acquisition data also. We now apply the foregoing
estimation formulas to the two-armed bandit data.

Table 13.7 summarizes the extinction data obtained from the Goodnow
experiments, and Table 13.8 gives the corresponding data from the
Robillard study. For each group we give the value of $D$ obtained from
the data, the value of $V_{1,0}$ obtained from the acquisition data, and the
computed estimates $\hat{\alpha}_2$.
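
The quantity $D$ can be recovered from the block proportions in Table 13.7: each block of ten trials contributes ten times the excess of its proportion over 1/2. The sketch below does this for two of Goodnow's groups; the function and variable names are ours.

```python
def responses_above_chance(block_proportions, trials_per_block=10):
    """Mean number of choices of the previously favorable side above chance."""
    return sum(trials_per_block * (p - 0.5) for p in block_proportions)


# Extinction proportions by block of ten trials (Table 13.7).
goodnow_50_0_pay_to_play = [0.84, 0.74, 0.76, 0.62, 0.62]
goodnow_75_25_pay_to_play = [0.64, 0.50, 0.66, 0.46, 0.54]

print(round(responses_above_chance(goodnow_50_0_pay_to_play), 1))
print(round(responses_above_chance(goodnow_75_25_pay_to_play), 1))
```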

The estimates from the Goodnow data show that extinction occurs
most rapidly after those training schedules in which one side was rewarded
100 percent of the trials. For the 100 : 0 play-free group and the 100 : 50
group, extinction was essentially "complete" at the end of the first block
of 10 trials. As a result, we were not able to obtain useful estimates of
$\alpha_2$ from those data, but it seems clear that the values are less than say 0.9.
The group most resistant to extinction was the 50 : 0 pay-to-play group:
it extinguished more slowly than the 50 : 0 play-free group, and both
50 : 0 groups extinguished more slowly than the 75 : 25 group. When
only the pay-to-play data are considered, the order of increasing resistance
to extinction is 100 : 0, 75 : 25, and 50 : 0; this is the order of decreasing
frequency of reward just prior to extinction.


TABLE 13.7

Proportions of choices of the previously favorable side during extinction in the
Goodnow "two-armed bandit" experiment. The quantity $D$ is the mean
number of extinction responses above chance on all 50 trials. The estimates
of $V_{1,0}$ were obtained from the computed value of $V_{1,n}$ for the end of
acquisition, and the estimates of $\alpha_2$ were obtained from equation 13.51.

Trial      100 : 0   100 : 0   50 : 0    50 : 0    100 : 50  75 : 25
Block      Pay to    Play      Pay to    Play      Pay to    Pay to
           Play      Free      Play      Free      Play      Play

0-9        0.44      0.64      0.84      0.62      0.44      0.64
10-19      0.54      0.42      0.74      0.66      0.50      0.50
20-29      0.54      0.48      0.76      0.64      0.42      0.66
30-39      0.44      0.44      0.62      0.54      0.52      0.46
40-49      0.62      0.50      0.62      0.54      0.58      0.54

$D$        0.8       0.2       10.8      5.0       -0.4      3.0
$V_{1,0}$  1.000     0.972     0.945     0.828     0.995     0.922
$\hat{\alpha}_2$   0.688     --        0.979     0.967     --        0.930


The Robillard extinction data given in Table 13.8 lead to similar
inferences regarding resistance to extinction after partial reinforcement
schedules. For the moment consider only those groups for which reward
occurred on one side only and for which the chips had no money value,
namely, the 80 : 0, 50 : 0, and 30 : 0 zero-cent groups. The order of
increasing resistance to extinction, as measured by the estimates $\hat{\alpha}_2$, is
80 : 0, 50 : 0, 30 : 0, although the difference between the first two groups
is very slight. The 80 : 40 and 60 : 30 groups extinguished more rapidly
than any of the other groups.

The general conclusion about resistance to extinction after partial
reinforcement, inferred from the Goodnow and Robillard data, is in
agreement with the results of a number of previous studies reviewed by
Jenkins and Stanley [10]. Most of those studies employed experimental
arrangements quite different from the symmetric two-choice situation,
and the conclusion seems to be a rather general one: resistance to
extinction increases monotonically with decreasing frequency of reward in
the preceding acquisition training.

One other inference may be made from the estimates of $\alpha_2$ for the
Robillard extinction data. It may be noted from Table 13.8 that of the
three 50 : 0 groups, the zero-cent group extinguished most rapidly and
the five-cent group most slowly, with the one-cent group being
intermediate. In Section 13.8 we noted that the zero-cent group learned
most rapidly and the five-cent group most slowly. Therefore, our
analysis of the data seems to show that increasing the amount of reward
slows down both acquisition and extinction.


TABLE 13.8

Proportions of choices of the previously favorable side during extinction in the
Robillard experiment. The quantity $D$ is the mean number of extinction
responses above chance on all 35 trials. The estimates of $V_{1,0}$ were obtained
from the computed value of $V_{1,n}$ for the end of acquisition, and the estimates of
$\alpha_2$ were obtained from equation 13.51.

Trial     50 : 0   50 : 0   50 : 0   30 : 0   80 : 0   80 : 40  60 : 30
Block     (0¢)     (1¢)     (5¢)     (0¢)     (0¢)     (0¢)     (0¢)

0-9       0.71     0.64     0.78     0.66     0.69     0.57     0.52
10-19     0.70     0.68     0.61     0.62     0.65     0.61     0.53
20-29     0.57     0.689    0.75     0.68     0.68     0.54     0.55
30-34     0.62     0.80     0.74     0.74     0.76     0.48     0.64

$D$       5.4      5.5      7.6      5.8      6.5      2.1      1.7
$V_{1,0}$ 0.891    0.825    0.794    0.729    0.990    0.731    0.681
$\hat{\alpha}_2$  0.967    0.976    0.991    0.990    0.965    0.946    0.948


13.11 COMPARISONS AND EVALUATIONS 


In the preceding sections, data on behavior in symmetric choice
situations have been analyzed. As a result, we are now in a position to
make a number of comparisons and re-examine the usefulness of the
general model.

The data from several experiments that used the non-contingent
procedure were found to agree with a prediction obtained from a model
based upon experimenter-controlled events and the equal alpha condition.
This model was found satisfactory for data obtained in several different
laboratories and with various physical arrangements for studying
two-choice behavior. In addition, the generalized model was equally
satisfactory when compared to Neimark's three-choice non-contingent data.
In some of the two-choice and three-choice non-contingent experiments
of Neimark, groups of subjects were run on schedules which included
"blank" trials (trials on which no light appeared). By assuming that
these trials did not alter the response probabilities, we again found
agreement between the data and the equal-alpha experimenter-controlled
event model. The large variety of data which agrees with this model is
impressive.

Data from numerous experiments which used the contingent procedure
of reinforcement present a somewhat different picture. We found no
single model adequate for handling all these data. Most of the contingent
two-choice data appear consistent with the identity operator assumption
for non-reward, but two exceptions exist: the Neimark 66 : 34 data and
the Detambel 50 : 0 data. In those two cases, an equal alpha assumption
seems more tenable. The three-choice contingent data obtained by
Neimark present an even more serious problem; no assumptions
considered led to a satisfactory model. Procedural and instructional
differences among the various contingent studies may be responsible for the
different results obtained, but such differences apparently had no
appreciable effect in the non-contingent studies. In spite of the difficulties just
mentioned we can make a number of inferences.

The studies of rats and human beings in similar situations give us an
opportunity for comparing rats and people, if anyone is interested in
such comparisons. In terms of the estimates of the success parameters $\alpha_1$
we can say that on the whole Stanley's rats learned faster than or about as
fast as high-school and college students in the 50 : 0 and 75 : 25 situations,
but that the students learned much faster than the rats in the 100 : 0
situation. It is difficult to equate rats and people on such variables as
amount of motivation or incentive, but we mention these comparisons
for what they are worth. Differences between experimental procedures
should also be emphasized.

The Goodnow data, given in Table 13.4, show that Harvard students
learn faster in the 100 : 0 and 50 : 0 situations when they are required to
pay to play rather than play free. (We hesitate to propose that these
results have implications for educational policies.) On the other hand,
the Robillard data in Table 13.5 show that increasing the amount of
reward per trial tends to decrease the rate of learning. This effect may
not be borne out by further experimental work (the differences are
clearly insignificant), but the Robillard data do demonstrate that the
amount of reward is not a major factor in this kind of learning experiment.

All the comparisons just mentioned were made in terms of the relative
estimated values of the parameter $\alpha_1$. The comparisons are trivial to
make once we have such a summary statistic of the data.

In Section 13.10 we analyzed some data on extinction and found results
consistent with other data on extinction after partial reinforcement,
reviewed by Jenkins and Stanley [10]. We found that the values of $\hat{\alpha}_2$
were indeed not unity for extinction, as was approximately the case for
acquisition in the same experiments, and that the extinction values of $\hat{\alpha}_2$
increased with decreasing frequency of reward during the preceding
training.

For a long time we tried to hold the position that an empirical event
such as "turning right and finding food" or "turning right and not finding
food" had associated with it a unique operator whose parameters were
independent of the reinforcement schedule being used at the moment.
We liked this principle of "event invariance" not because we had good
psychological evidence or intuition to support it but rather because it
provided a rather nice parsimony in the model: the identification between
model event and empirical event could be made once and for all in a given
series of experiments. For example, the non-reward parameter $\alpha_2$ would
have the same value in 100 percent reinforcement, 30 percent
reinforcement, and in extinction for a particular apparatus, type of organism,
strength of drive, etc. But we have been forced to abandon this view in
light of the experimental evidence. The two-armed bandit extinction
data clearly show that $\alpha_2$ gets nearer to unity the smaller the reward
probability in the preceding acquisition training when reward occurred
on only one side.

[...] described here require that $\alpha_2$ be [...] acquisition. If we were to
cling to our [...] the probability at the beginning [...] the acquisition
asymptote. These de[...] the experimental findings and [...] a broader and
less parsimonious view of event identification. We conclude that the
appropriate event identification does not simply involve "turning right and
finding no food" but rather is "turning right and finding no food, given a
particular psychological set or expectancy." This conclusion will be no
surprise to those who are aware of the large body of experimental work on
the effects of set in influencing behavior. Nevertheless, it seems
worthwhile to record our unsuccessful attempt to avoid this complication.


13.12 SUMMARY 
Data from experiments on rats and human beings in symmetric choice
situations are analyzed in this chapter. The reference experiments are
those by Humphreys [3], in which subjects predicted whether or not a
light would come on during each trial, and by Brunswik [1], in which rats 
were run in a simple T-maze. A distinction is made between two experi- 
mental procedures, called contingent and non-contingent. With the 
contingent technique, the probabilities of certain kinds of environmental 
changes depend upon the subject's choice, whereas in the non-contingent 
experiments, those changes occur independent of the subject’s choice. 

A model with experimenter-controlled events and equal alphas is
presented for analyzing data from non-contingent experiments. All
the data examined agree with the model prediction that the asymptotic
proportion of $A_1$ responses is equal to the probability of occurrence of
event $E_1$. For the contingent experiments, a model using
experimenter-subject-controlled events is developed. Most of the data are
consistent with the assumption that the operators associated with
non-reward are identity operators.

From estimates of the alphas obtained from various sets of data,
comparisons of several kinds are made: rats with people, paying to play
with playing free, small reward with larger reward, and college students
with high-school students. Some data on extinction are also analyzed.


REFERENCES 

1. Brunswik, E. Probability as a determiner of rat behavior. J. exp. Psychol., 1939,
25, 175-197.

2. Stanley, J. C., Jr. The differential effects of partial and continuous reward upon the
acquisition and elimination of a runway response in a two-choice situation. Ed.D.
Thesis, Harvard University, 1950.

3. Humphreys, L. G. Acquisition and extinction of verbal expectations in a situation
analogous to conditioning. J. exp. Psychol., 1939, 25, 294-301.

4. Grant, D. A., Hake, H. W., and Hornseth, J. P. Acquisition and extinction of a
verbal conditioned response with differing percentages of reinforcement. J. exp.
Psychol., 1951, 42, 1-5.

5. Grant, D. A., Hornseth, J. P., and Hake, H. W. The influence of the inter-trial
interval on the Humphreys' "random reinforcement" effect during the extinction
of a verbal response. J. exp. Psychol., 1950, 40, 609-612.

6. Jarvik, M. E. Probability learning and a negative recency effect in the serial
anticipation of alternative symbols. J. exp. Psychol., 1951, 41, 291-297.

7. Hake, H. W., and Hyman, R. Perception of the statistical structure of a random
series of binary symbols. J. exp. Psychol., 1953, 45, 64-74.

8. Detambel, M. H. A re-analysis of Humphreys' "acquisition and extinction of verbal
expectations." M.A. Thesis, Indiana University, 1950.

9. Neimark, E. D. Effects of type of non-reinforcement and number of alternative
responses in two verbal conditioning situations. Ph.D. Thesis, Indiana University,
1953.

10. Jenkins, W. O., and Stanley, J. C., Jr. Partial reinforcement: a review and critique.
Psychol. Bull., 1950, 47, 193-234.


CHAPTER 14 


Runway Experiments 


14.1 THE EXPERIMENTS


One of the simplest demonstrations of animal learning is found in the 
runway experiment reported by Graham and Gagné [1]. A hungry rat 
is placed at one end of a straight alley and is allowed to run to the other 
end to obtain food. Times spent by the rat in various portions of the 
alley are recorded on each of a series of trials, and it is found that the 
times decrease very rapidly at first and then tend to stabilize at some 
minimum value. In this chapter we develop a model for analyzing data 
from such experiments. 

In the Graham-Gagné experiments, the apparatus consisted of a starting 
box and a food box connected by a straight alley 3 feet long. The length
of time the rat remained in the starting box was called the latency, and 
the time consumed in traversing the alley was called the running time. A 
group of 21 albino rats was run through 15 trials of acquisition followed 
by 5 trials of extinction. 

A more extensive runway experiment has been carried out by Weinstock 
[2]. The basic design of the apparatus was the same as that used by 
Graham and Gagné, but 3 times were recorded; the latency was defined as 
the time required to pass a point 6 inches from the starting end, and the 
running times to points 18 inches and 30 inches from the starting end were 
also recorded by photo-cells and timers. Acquisition consisted of 108 
trials, with only one trial per rat per day. In addition to the 100 percent
group of 23 animals, Weinstock ran 5 partial reinforcement groups.
Extinction was studied in all groups.

We analyze in some detail the latency data obtained by Weinstock on 
the 100 percent group. We are indebted to him for making the original 
data available to us. The model attempts to handle not only the mean 
latencies from trial to trial but also the distributions of latencies. We 
consider fitting a theoretical curve to the mean latencies a minimal 
program which uses only a small part of the information in the data. 




14.2 IDENTIFICATION PROBLEMS 


From the general model described in Part I, we wish to construct a
specific model for the runway experiment. As in preceding chapters we
need to make identifications between elements in the general model and
features of the experiment being described, but the runway presents some
special identification problems. What shall we identify with alternatives
$A_1$ and $A_2$? The runway does not appear to be a choice situation at all;
the rat does not choose right or left alleys. On each experimental trial
a rat runs from the starting position to the food box, for a trial is so defined
by the experimenter. Our task is to construct a model for the runway by
creating a choice situation in our mind even though there may not be one
in the rat's.

There are several ways to view the runway as a choice situation. For
example, we could postulate that the rat "chooses" a reaction time or
latency from a distribution of times. We would need to allow many
possible values of latency (perhaps an infinite number) and so we would
have a very large number of alternatives $A_1, A_2, \cdots, A_r$. Another
possibility leading to a simpler model is to choose an arbitrary time $T$ and
to define $A_1$ as the occurrence of a latency less than $T$ and $A_2$ as the
occurrence of a latency greater than or equal to $T$. These definitions would
reduce the data to 0's and 1's, similar to data from other experiments,
but we would lose all the fine detail of the time measurements. Still
another possibility is to postulate that two distributions of reaction times
are available to the rat and that on each trial a random sample is drawn
from one distribution or the other; $A_1$ would be identified with drawing
a sample from the distribution with small mean latency and $A_2$ with
drawing a sample from the other. If this were the model, the rat would
gradually reduce his latencies by choosing his latency more and more
often from the distribution with the smaller reaction times. This
double-distribution model has its attractions because it would help describe
some of the extremely large observations that occur in latency data. The
disadvantage stems from the failure to describe the gradual decrease in
latencies after many trials. This disadvantage could be overcome by
introducing two sequences of distributions, members of the sequences
corresponding to trials.

The models just mentioned have one feature in common: one of a set
of alternatives occurs on an experimental trial. Operationally, we would
observe a latency and assert that $A_1$ or $A_2$ occurred (or $A_3, A_4, \cdots$),
depending on the value of the latency and the criterion being used. The
latency data from trial to trial would thereby be translated into a sequence
of $A$'s for each rat. The important point is that an experimental trial
corresponds to a trial as defined in Chapter 1 (an opportunity for choice
among $r$ alternatives) and those alternatives correspond to values of
latencies. In spite of this advantage we do not choose such models.
A rather different set of models can be generated by assuming that a
sequence of $A_1$'s and $A_2$'s occurs on each experimental trial. If we rather
loosely identified $A_1$ with a "goal-directed movement" and $A_2$ with all
other behavior, we would conceive that an experimental trial consists of
a series of movements, some goal directed and some not. The model
originally proposed by Estes [3] and discussed by Bush and Mosteller [4]
was essentially of this type. It was assumed that an experimental trial
was terminated by the first occurrence of $A_1$ and that it was preceded by
a certain number of $A_2$'s. If it takes more than one goal-directed response
to complete a trial, it is no trouble to generalize the model to require some
fixed number $k$ of $A_1$'s for trial termination. The basic difference between
such models and the ones previously mentioned is that an experimental
trial now consists of a series of steps, and a choice between $A_1$ and $A_2$
occurs at each such step. As a result, $A_1$ and $A_2$ correspond to overt acts
or movements rather than the selection of a latency value. In this chapter
we use this class of models.

Two main models of the type just described are developed in the
following two sections. In both we assume that an experimental trial consists
of a sequence of $A_1$'s and $A_2$'s and that the trial is terminated when the
$k$th $A_1$ occurs. The number of $A_2$ occurrences varies during the course
of learning. The fundamental difference between these two models lies
in the assumptions about the times required for the $A_1$ and $A_2$ occurrences.
The early model described by Estes and by Bush and Mosteller assumed
that each act required a constant time $h$; the model developed in Section

i Clearly, the notion of a constant time / 

an: i est leads to an adequate approximation. 

which are integral multiples of $h$, but if $h$
is sufficiently small we would not be seriously worried about this apparent
absurdity. The model described in Section 14.4 involves a different
assumption, namely, that the time required for an $A_1$ or $A_2$ act is a random
variable.
This model leads to à 


14.3 A MODEL WITH DISCRETE TIMES

We first discuss a model in which ...
a goal-directed movement, and $k$ such acts are required to bring the rat
from the starting position to the point at which the latency is clocked.
Interspersed among the $k$ occurrences of $A_1$ are a number of $A_2$ occurrences.
If a total of $N$ acts of type $A_1$ or $A_2$ occur, the latency $L$ is

(14.1)    $L = Nh.$


Now on experimental trial $n$, suppose that there is a constant probability
$p_n$ that $A_1$ occurs during an interval of length $h$. In terms of this
probability $p_n$ we wish to compute the probability $P(N_n)$ that precisely $N_n$
responses occur before $k$ goal-directed responses have taken place.
This probability is given by the well-known negative binomial distribution
[5, p. 61]:

(14.2)    $P(N_n) = \binom{N_n - 1}{k - 1} p_n^k (1 - p_n)^{N_n - k},$

where

(14.3)    $\binom{N_n - 1}{k - 1} = \dfrac{(N_n - 1)!}{(N_n - k)!\,(k - 1)!}$

is a binomial coefficient. We are concerned with the mean latency, and
so we want an expression for the expected value of $N_n$ in terms of $p_n$ and
$k$. It can be shown that

(14.4)    $E(N_n) = k/p_n.$

It follows that the expected value of the latency $L_n$ is

(14.5)    $E(L_n) = hE(N_n) = hk/p_n.$

This expression for the mean latency involves the time interval $h$ and the
necessary number $k$ of $A_1$ occurrences only as a product $hk$. We denote
this product by

(14.6)    $L_0 = hk,$

and the expected latency can be written as

(14.7)    $E(L_n) = L_0/p_n.$

When analyzing latency data we could consider $L_0$ a parameter to be
estimated from the data.
Any good model for the runway should be able to reproduce the mean
latencies reasonably well, but we consider this a minimal requirement
because curve-fitting methods are adequate for this purpose. Estes, in
his early paper [3], fitted the means of the Graham-Gagné data as did
Graham and Gagné [1]; we have done the same with Weinstock's data,
though we do not present the results here. But to show its worth in
analyzing such data, a model should reproduce the general character of
the whole distributions of latencies. In Fig. 14.1 we show an example of
the distribution function defined by equation 14.2. An important
property of this distribution is its variance given by the formula


(14.8)    $\sigma^2(N_n) = k \dfrac{1 - p_n}{p_n^2}.$

The variance of the latencies $L_n$ is $h^2$ times this variance:

(14.9)    $\sigma^2(L_n) = h^2 k \dfrac{1 - p_n}{p_n^2} = \dfrac{L_0^2}{k} \dfrac{1 - p_n}{p_n^2}.$

From this expression we find that the variance of the latencies is zero
when $p_n = 1$. This result is intuitively obvious since when $p_n = 1$ an
$A_1$ is certain to occur during each interval of length $h$, and so the latency
has the constant value $hk$. This conclusion creates a rather serious
objection to the model described in this section: the model predicts that
there will be no variability in latency when learning is complete provided
that we assume that $p_n$ approaches an asymptote of unity. If we find
experimentally that the latency distributions stabilize after many trials
but that the variance is appreciable, we must either assume that $\lambda$, the
asymptote of $p_n$, is not unity or admit that the model has failed to describe
this detail of the data. There is no a priori reason for insisting that $\lambda = 1$,
but we feel that this assumption has some intuitive appeal. Observed
reaction times are notoriously variable, and we object to ascribing
deviations from split-second precision to imperfect learning ($\lambda \neq 1$).
This objection, weak as it may appear to some readers, in addition to the
more obvious objection that this model permits only discrete values of
latencies, led us to develop the continuous model presented in the following
section.

Fig. 14.1. An example of the negative binomial distribution of equation 14.2.
The values $k = 5$ and $p_n = 0.5$ were used in constructing the figure.
[Plot of $P(N_n)$ against $N_n$.]
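
The discrete-time results of this section can be checked numerically. The following is a minimal sketch in Python, not part of the original analysis; the function name `neg_binomial_pmf` and the truncation point `n_max` are assumptions made for illustration. It evaluates equation 14.2 for the values used in Fig. 14.1 and compares the mean and variance with equations 14.4 and 14.8.

```python
from math import comb

def neg_binomial_pmf(n_acts, k, p):
    """Equation 14.2: probability that the k-th A1 falls on act number n_acts."""
    if n_acts < k:
        return 0.0
    return comb(n_acts - 1, k - 1) * p**k * (1 - p) ** (n_acts - k)

k, p = 5, 0.5      # values used for Fig. 14.1
n_max = 400        # truncation point (assumption); the tail beyond it is negligible here
support = range(k, n_max + 1)
pmf = [neg_binomial_pmf(n, k, p) for n in support]

total = sum(pmf)
mean = sum(n * q for n, q in zip(support, pmf))
var = sum((n - mean) ** 2 * q for n, q in zip(support, pmf))

print(f"total probability = {total:.6f}")
print(f"mean     = {mean:.4f}  (k/p         = {k / p:.4f})")
print(f"variance = {var:.4f}  (k(1-p)/p^2  = {k * (1 - p) / p**2:.4f})")
```

Multiplying the mean by $h$ and the variance by $h^2$ gives the corresponding quantities for the latency. Setting `p = 1.0` puts all the probability on `n_acts = k`, which is the zero-variance case discussed above.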




14.4 A MODEL WITH CONTINUOUS TIMES 


The runway model given in the preceding section assumed that each $A_1$
and $A_2$ response requires a fixed time of length $h$. A generalization of this
model may be made as follows. Let the time required for an $A_1$ act be
$t_1$ and assume that $t_1$ is some constant $h_1$ plus a random variable $\tau_1$:

(14.10)    $t_1 = h_1 + \tau_1.$

Similarly, let the time required for an $A_2$ act be

(14.11)    $t_2 = h_2 + \tau_2,$

where $h_2$ is a constant, not necessarily equal to $h_1$, and where $\tau_2$ is a
random variable. When $p_n = 1$, an experimental trial consists of
precisely $k$ acts of type $A_1$. The latency in this case is $kh_1$ plus the sum of $k$
random variables $\tau_1$, and thus the variance of the asymptotic latencies is
unrestricted; it depends upon the assumed distribution of the random
variable $\tau_1$.

We wish to provide a model which specifies the precise form of the
distributions of latencies, and so we must make some further assumptions.
We could merely fit the data with a sensible distribution function, but
we prefer to derive the distribution of latencies from a simple assumption
about the distributions of the random variables $\tau_1$ and $\tau_2$. We assume
that both $\tau_1$ and $\tau_2$ are distributed according to the function

(14.12)    $f(\tau) = s e^{-s\tau}.$


Several plausibility arguments can be made in favor of such a distribution
of $\tau$; an analogous assumption is made in many time problems such as in
the theory of Geiger-Müller counters. This distribution may be derived
from a pseudo-neurological model as follows. Consider $\tau$ to be a reaction
time which follows the "decision" to make an $A_1$ act or an $A_2$ act. A
decision is made, and $\tau$ seconds later the act begins. Then suppose a
neuron in the nervous system of the organism fires every $\delta$ seconds but
that the act begins only when a neuron of an appropriate class fires.
Let $\pi$ be the probability that a neuron of the appropriate class fires. The
probability that this will occur after precisely $j - 1$ inappropriate neurons
have fired is

(14.13)    $P(j) = \pi (1 - \pi)^{j-1}.$

This, then, is the probability that the reaction time will be $j\delta$ seconds.
We now wish to go to the limit as $\delta$ becomes vanishingly small. But we
will take the limit while maintaining constant $\pi/\delta$, the probability per
second of the firing of an appropriate neuron. We let

(14.14)    $\tau = j\delta$

and

(14.15)    $s = \pi/\delta.$

Thus, equation 14.13 may be written

(14.16)    $P(\tau) = \dfrac{s\delta}{1 - s\delta} (1 - s\delta)^{\tau/\delta}.$

Before going to the limit as $\delta \to 0$, we replace $P(\tau)$ with $f(\tau)\Delta\tau$, where
$\Delta\tau = \delta$. We do this because the discrete distribution is being replaced
by a continuous one, so the density $P(\tau)$ is being approximated by the
ordinate $f(\tau)$ of a continuous distribution times the width of the interval
$\Delta\tau$. Thus we have

(14.17)    $f(\tau) = \dfrac{s}{1 - s\delta} (1 - s\delta)^{\tau/\delta}.$

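We next expand the right-hand factor:

(14.18)    $(1 - s\delta)^{\tau/\delta} = 1 - (\tau/\delta)\,s\delta + \dfrac{(\tau/\delta)(\tau/\delta - 1)}{2!}(s\delta)^2 - \dfrac{(\tau/\delta)(\tau/\delta - 1)(\tau/\delta - 2)}{3!}(s\delta)^3 + \cdots$

           $= 1 - s\tau + \dfrac{(s\tau)(s\tau - s\delta)}{2!} - \dfrac{(s\tau)(s\tau - s\delta)(s\tau - 2s\delta)}{3!} + \cdots.$

When we then take the limit as $\delta \to 0$ the terms involving $s\delta$ vanish in the
expansion on the right, and we get

(14.19)    $\lim_{\delta \to 0} (1 - s\delta)^{\tau/\delta} = 1 - s\tau + \dfrac{(s\tau)^2}{2!} - \dfrac{(s\tau)^3}{3!} + \cdots = e^{-s\tau}.$

Therefore, as $\delta \to 0$, we have from equation 14.17

(14.20)    $f(\tau) = s e^{-s\tau},$

in agreement with equation 14.12. We do not consider this
pseudo-neurological argument essential to the model being developed in this
section. Rather, we include it for whatever plausibility value it may have.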
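
The passage to the limit can be illustrated numerically. The sketch below is an illustration only; the function names and the values of `s` and `tau` are assumptions. It evaluates the right side of equation 14.17 for a decreasing sequence of firing intervals $\delta$ and compares each value with the exponential density of equation 14.20.

```python
from math import exp

def f_discrete(tau, s, delta):
    """Equation 14.17: density implied by a neuron firing every delta seconds."""
    return s / (1 - s * delta) * (1 - s * delta) ** (tau / delta)

def f_limit(tau, s):
    """Equation 14.20: the exponential density s * exp(-s * tau)."""
    return s * exp(-s * tau)

s, tau = 2.0, 0.75                      # illustrative values (assumptions)
deltas = (0.1, 0.01, 0.001, 0.0001)

errors = []
for delta in deltas:
    value = f_discrete(tau, s, delta)
    errors.append(abs(value - f_limit(tau, s)))
    print(f"delta = {delta:<7} f(tau) = {value:.6f}")
print(f"limiting value  f(tau) = {f_limit(tau, s):.6f}")
```

The discrepancy shrinks roughly in proportion to $\delta$, as the expansion in equation 14.18 suggests.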

We assume that both $\tau_1$ and $\tau_2$ of equations 14.10 and 14.11 are
distributed according to $f(\tau)$ given by equation 14.12. Furthermore, we assume
that the constant $h_2$ is zero, that is, that $A_2$ responses do not require time
for their execution but do involve a reaction time $\tau_2$. This suggests that
we consider $A_2$ responses to be "doing nothing" or "standing still." The
total latency on an experimental trial, then, is $kh_1$ plus the sum of $N$
random variables $\tau$ from the distribution $f(\tau)$. The number $N$ will also
be a random variable distributed according to the negative binomial of
equation 14.2. But first we need to compute the distribution of the sum
of $N$ independent random observations from $f(\tau)$ for a given value of $N$.
If we denote the sum by $t$ and the distribution function by $g_N(t)$, it is
well known that

(14.21)    $g_N(t) = \dfrac{s e^{-st} (st)^{N-1}}{(N - 1)!}.$

This expression is the gamma distribution [5, p. 112] and is readily derived
from $f(\tau)$ by the method of moment generating functions.


The probability of a given value of $N$ is

(14.22)    $P(N) = \dfrac{(N - 1)!}{(k - 1)!\,(N - k)!}\, p^k (1 - p)^{N-k},$

where $p$ is the probability of an $A_1$ response at each step in the process
during an experimental trial. To obtain the unconditional distribution
of $t$ we multiply $g_N$ by $P(N)$ to get the joint distribution of $N$ and $t$, and
then sum over values of $N$ from $k$ to $\infty$:

(14.23)    $g(t) = \sum_{N=k}^{\infty} g_N(t) P(N)$

           $= \dfrac{s e^{-st} p^k}{(k - 1)!} \sum_{N=k}^{\infty} \dfrac{(st)^{N-1} (1 - p)^{N-k}}{(N - k)!}$

           $= \dfrac{s e^{-st} p^k (st)^{k-1}}{(k - 1)!} \sum_{N=k}^{\infty} \dfrac{[st(1 - p)]^{N-k}}{(N - k)!}.$

The summation has terms of the form $x^n/n!$ summed from 0 to $\infty$ with
$x = st(1 - p)$, that is, the series expansion of $e^x$. Therefore

(14.24)    $g(t) = \dfrac{s e^{-st} p^k (st)^{k-1}}{(k - 1)!}\, e^{st(1 - p)}.$

Re-arranging terms, we get finally

(14.25)    $g(t) = \dfrac{sp}{(k - 1)!}\, (spt)^{k-1} e^{-spt}.$
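
The summation leading to equation 14.25 can be verified term by term. The following sketch is offered as an illustration, not as the authors' computation; the function names, the parameter values, and the truncation of the series after `n_terms` terms are assumptions. It sums $g_N(t)P(N)$ directly and compares the result with the closed form.

```python
from math import comb, exp, factorial

def g_given_n(t, n, s):
    """Equation 14.21: density of the sum of n exponential reaction times."""
    return s * exp(-s * t) * (s * t) ** (n - 1) / factorial(n - 1)

def prob_n(n, k, p):
    """Equation 14.22: probability that the trial takes n acts in all."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

def g_series(t, k, p, s, n_terms=120):
    """Equation 14.23, truncated after n_terms terms (assumption)."""
    return sum(g_given_n(t, n, s) * prob_n(n, k, p) for n in range(k, k + n_terms))

def g_closed(t, k, p, s):
    """Equation 14.25: gamma density with scale factor s * p."""
    return s * p / factorial(k - 1) * (s * p * t) ** (k - 1) * exp(-s * p * t)

k, p, s = 5, 0.4, 2.0                   # illustrative values (assumptions)
for t in (0.5, 2.0, 5.0):
    print(f"t = {t}: series = {g_series(t, k, p, s):.10f}, "
          f"closed form = {g_closed(t, k, p, s):.10f}")
```

The two columns agree to within rounding error, which reflects the fact that interspersing $A_2$ acts merely replaces the scale factor $s$ by $sp$.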


The latency on an experimental trial is $t$ plus $kh_1$, the minimum possible
time. We let

(14.26)    $c = kh_1,$

and

(14.27)    $L = t + c.$

Thus the distribution of the latencies $L$ is

(14.28)    $\phi(L) = \begin{cases} \dfrac{sp}{(k - 1)!}\, [sp(L - c)]^{k-1} e^{-sp(L - c)}, & L \geq c, \\ 0, & L < c. \end{cases}$

An example of this distribution function is shown in Fig. 14.2. It is the
generalized gamma distribution.

Fig. 14.2. An example of the gamma distribution of equation 14.28. The values
$k = 5$, $sp = 1$, and $c = 30$ were used in constructing the figure.
[Plot of $\phi(L)$ against $L$.]

It can be shown that the mean is

(14.29)    $E(L) = c + \dfrac{k}{sp},$

and the variance is

(14.30)    $\sigma^2(L) = \dfrac{k}{(sp)^2}.$
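
A direct simulation of the process gives a check on the mean and variance just stated. The sketch below is illustrative only: the function `simulate_latency`, the random seed, the sample size, and the separate values of `p` and `s` are assumptions (the figure fixes only the product $sp = 1$ and $c = 30$). Each act, whether $A_1$ or $A_2$, is preceded by an exponential reaction time, and the trial ends with the $k$th $A_1$.

```python
import random

def simulate_latency(k, p, s, c, rng):
    """One latency under the continuous-time model of Section 14.4."""
    a1_count = 0
    t = 0.0
    while a1_count < k:
        t += rng.expovariate(s)    # reaction time preceding each act
        if rng.random() < p:       # the act is an A1 with probability p
            a1_count += 1
    return c + t                   # c = k * h1 is the minimum possible time

rng = random.Random(0)             # fixed seed so the run is repeatable
k, p, s, c = 5, 0.5, 2.0, 30.0     # sp = 1 and c = 30 as in Fig. 14.2
sample = [simulate_latency(k, p, s, c, rng) for _ in range(50_000)]

mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)

print(f"sample mean     = {mean:.3f}  (c + k/(sp) = {c + k / (s * p):.3f})")
print(f"sample variance = {var:.3f}  (k/(sp)^2   = {k / (s * p) ** 2:.3f})")
```

Because only the product $sp$ enters equation 14.28, other pairs with the same product (for example `p = 1.0, s = 1.0`) should give the same distribution apart from sampling error.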


We use the distribution function $\phi(L)$ defined by equation 14.28 to analyze
some of the data obtained by Weinstock [2]. Before doing this we
consider the problem of estimating the parameters of $\phi(L)$ from data.


14.5 ESTIMATION OF PARAMETERS OF THE 
ASYMPTOTIC DISTRIBUTION 


We attempt to use the gamma distribution of equation 14.28 to describe
the observed asymptotic distribution of latencies. We assume that
$p = 1$ during the latter trials in the experiment, and we wish to estimate
$s$, $c$, and $k$ from these data. However, we do not assume that the zero
point $c$ and the scale factor $s$ are the same for all animals, but we do assume
that the number $k$ is the same for all animals in the experiment. The
subscript $i = 1, 2, \cdots$ is used to distinguish the animals, and the subscript
$n$ will be used for trials as usual. The latency of the $i$th animal on trial $n$
is then $L_{in}$. Thus, from equation 14.28 we write for $p = 1$,

(14.31)    $\phi(L_{in}) = \begin{cases} \dfrac{s_i}{(k - 1)!}\, [s_i(L_{in} - c_i)]^{k-1} e^{-s_i(L_{in} - c_i)}, & L_{in} \geq c_i, \\ 0, & L_{in} < c_i. \end{cases}$

We wish to estimate the parameter $k$ for a group of animals and the
parameters $c_i$ and $s_i$ for each animal.

Various properties of the distribution can be used to estimate the
parameters; one easy procedure is to use the moments. Equations
14.29 and 14.30 give the mean and variance, which could be used along
with a third property to carry out the estimation. For example, we
could estimate the parameter $c_i$ by the smallest latency observed for the
$i$th animal, but this estimate clearly has a positive bias. The objection
to using the distribution moments for the estimation is that latency data
frequently contain a small percentage of exceedingly large observations
which may result from uncontrolled or uncontrollable variations in
experimental conditions; the moments are very sensitive to large
observations. In the next section we present an example of such data.

The maximum likelihood procedure can be used to estimate the
parameters of the distribution function given by equation 14.31. It can be
shown that this procedure leads to expressions which involve three
measures of location (cf. [7]): the arithmetic mean, the harmonic mean,
and the geometric mean. The last two means are rather difficult to
compute, as they involve the unknown parameters $c_i$, and so we do not
propose this procedure, though we have carried it through.

An estimation procedure we have found relatively simple to carry out,
and one which is not sensitive to very large observations, makes use of
percentage points of the cumulative distribution. We first transform the
distribution function of equation 14.31 by letting

(14.32)    $x_{in} = s_i(L_{in} - c_i).$

We then obtain for the distribution of $x_{in}$ the gamma distribution:

(14.33)    $\psi(x_{in}) = \begin{cases} \dfrac{x_{in}^{k-1} e^{-x_{in}}}{(k - 1)!}, & x_{in} \geq 0, \\ 0, & x_{in} < 0. \end{cases}$

The percentage points are then defined by

(14.34)    $\int_0^u \psi(x_{in})\, dx_{in} = F(u).$

This integral, which is related to the incomplete gamma function, has been
extensively tabulated [6]. The 25th percentile, $u_{25}$, for example, is the
upper limit of the integral which corresponds to $F(u) = 0.25$. Percentiles
to be used in later sections are shown in Table 14.1.


TABLE 14.1

Percentage points of the gamma distribution for
several values of $k$, and values of the ratio $\rho$.

  k     u_10     u_25     u_75     ρ = (u_75 − u_25)/(u_25 − u_10)
  3     1.10     1.73     3.92     3.48
  4     1.75     2.53     5.11     3.31
  5     2.44     3.37     6.27     3.12
  6     3.15     4.22     7.42     2.99
  7     3.90     5.08     8.56     2.95

The corresponding percentage points $U$ of the distribution of $L_{in}$ are
obtained from equation 14.32:

(14.35)    $U = \dfrac{u}{s_i} + c_i.$

As shown in Table 14.1, the $u$'s are known functions of $k$, and so the $U$'s
are functions of $k$, $s_i$, and $c_i$. Hence three percentage points from the
observed latency distribution for a single animal are sufficient to estimate
the three parameters.
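
For integral $k$ the integral of equation 14.34 has a closed form, so the percentage points need not be read from tables. The following is a minimal sketch under that assumption; the function names and the bisection tolerance are illustrative choices, and the ratio is taken to be $(u_{75} - u_{25})/(u_{25} - u_{10})$, which is the definition consistent with the entries of Table 14.1.

```python
from math import exp

def gamma_cdf(u, k):
    """F(u) of equation 14.34 for integer k: 1 - exp(-u) * sum_{j<k} u^j / j!."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= u / j
        total += term
    return 1.0 - exp(-u) * total

def percentage_point(prob, k, tol=1e-10):
    """Solve F(u) = prob for u by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_cdf(hi, k) < prob:     # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gamma_cdf(mid, k) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(" k    u_10    u_25    u_75    rho")
for k in range(3, 8):
    u10, u25, u75 = (percentage_point(q, k) for q in (0.10, 0.25, 0.75))
    rho = (u75 - u25) / (u25 - u10)
    print(f"{k:2d}  {u10:6.2f}  {u25:6.2f}  {u75:6.2f}  {rho:6.2f}")
```

The printed rows can be compared with Table 14.1; small differences in the last digit may arise from rounding in the tabulated values.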


First consider estimating the parameter $k$. Let $U'$, $U''$, and $U'''$ be
three observed percentage points such that $U' > U'' > U'''$. From
equation 14.35 we see that

(14.36)    $U' - U'' = \dfrac{u' - u''}{s_i}, \qquad U'' - U''' = \dfrac{u'' - u'''}{s_i},$

and so

(14.37)    $\rho = \dfrac{U' - U''}{U'' - U'''} = \dfrac{u' - u''}{u'' - u'''}.$

Some values of the ratio $\rho$ are shown in Table 14.1 for several values of $k$.
The observed ratio can then be used to estimate $k$.

Once $k$ has been estimated, the corresponding values of the $u$'s can be
obtained from Table 14.1. These, then, can be used in equation 14.35
to estimate $s_i$ and $c_i$. In the next section we illustrate the procedure in
detail.


14.6 ANALYSIS OF THE WEINSTOCK DATA 


We now analyze the latency data from Weinstock's 100 percent
reinforcement group. Twenty-three rats were run through 108 trials of acquisition;
the latency on the first extinction trial reflects the previous reward training,
and so the data on 109 trials for each animal are used. First we estimate
the parameters of the asymptotic distributions by using the data from the
last 40 trials; for the analysis, it is assumed that $p = 1$ for these trials.

As mentioned in the preceding section, latency data often contain a
small percentage of extremely large observations. Preliminary analyses
of the Weinstock data made it evident that the models described in this
chapter could not account for these large observations while at the same
time describing the major portion of the data. In the next section we
discuss the problem of handling the large observations, but in this section
we are not concerned with them. To estimate parameters of the model,
we choose three percentiles (the 10th, 25th, and 75th) which are relatively
insensitive to the extreme tail of the distribution. From the data on the
last 40 trials we obtain these percentage points for each animal and then
compute the ratio

(14.38)    $\rho = \dfrac{U_{75} - U_{25}}{U_{25} - U_{10}}.$

In Table 14.2 we show the results. The mean value of the $\rho$'s for the
animals is 3.148; in Table 14.1 we see that this is closest to the value
of $\rho$ for $k = 5$. Further computations are simplified if $k$ is an integer.
Hence we take as our estimate of $k$, the number of goal-directed acts,

(14.39)    $\hat{k} = 5.$


Using this estimate of $k$, we next estimate $s_i$ and $c_i$ for each animal from
the observed 25th and 75th percentiles. From equation 14.35 and Table
14.1 we obtain for $k = 5$,

(14.40)    $U_{25} = \dfrac{3.37}{s_i} + c_i, \qquad U_{75} = \dfrac{6.27}{s_i} + c_i.$

Solving these for $s_i$ and $c_i$, we have

(14.41)    $\hat{s}_i = \dfrac{2.90}{U_{75} - U_{25}}, \qquad \hat{c}_i = U_{75} - 2.16(U_{75} - U_{25}).$

These equations lead to the estimates shown in Table 14.2.

TABLE 14.2

Data used for estimating $k$ for the group of rats and $s_i$
and $c_i$ for each rat. The observed percentage points are
$U_{10}$, $U_{25}$, and $U_{75}$. The ratio of equation 14.38 is $\rho$, and
the estimates from equations 14.41 are $\hat{s}_i$ and $\hat{c}_i$.

 Rat    U_10    U_25    U_75     ρ      ŝ_i     ĉ_i
   1    29.7    30.3    33.1    4.67    1.04    27.0
   2    31.9    32.9    37.0    4.10    0.71    28.1
   3    38.4    40.1    42.9    1.65    1.04    36.8
   4    32.9    34.8    41.6    3.58    0.43    26.9
   5    36.2    37.9    41.1    1.88    0.91    34.2
   6    33.1    35.7    45.5    3.77    0.30    24.3
   7    32.8    33.8    38.1    4.30    0.67    28.8
   8    34.8    35.8    38.8    3.00    0.97    32.3
   9    37.0    38.3    42.8    3.46    0.64    33.1
  10    36.0    37.5    44.7    4.80    0.40    29.1
  11    30.2    31.7    33.8    1.40    1.38    29.3
  12    39.3    40.2    43.1    3.22    1.00    36.8
  13    34.5    36.0    38.8    1.87    1.04    32.7
  14    34.0    35.3    39.2    3.00    0.74    30.8
  15    33.6    35.8    40.1    1.95    0.67    30.8
  16    36.0    36.9    39.9    3.33    0.97    33.4
  17    38.9    40.8    45.5    2.47    0.60    35.3
  18    34.0    35.9    38.3    1.26    1.21    33.1
  19    31.8    33.0    37.3    3.58    0.67    28.0
  20    31.9    32.8    37.5    5.22    0.62    27.3
  21    44.6    47.2    55.4    3.00    0.37    38.0
  22    34.1    35.5    40.0    3.21    0.64    30.3
  23    31.5    34.1    43.6    3.65    0.31    23.1
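
The percentile procedure amounts to a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the dictionary holds the ratios from Table 14.1, and the constants 3.37 and 6.27 are the tabled 25th and 75th percentage points for $k = 5$. It is applied to the three observed percentage points reported for rat 1 (29.7, 30.3, and 33.1).

```python
# Ratios from Table 14.1, keyed by k.
TABLE_14_1_RHO = {3: 3.48, 4: 3.31, 5: 3.12, 6: 2.99, 7: 2.95}
U25_K5, U75_K5 = 3.37, 6.27        # tabled percentage points for k = 5

def ratio(u10, u25, u75):
    """Observed ratio of equation 14.38."""
    return (u75 - u25) / (u25 - u10)

def nearest_k(rho):
    """Tabled k whose ratio lies closest to the observed ratio."""
    return min(TABLE_14_1_RHO, key=lambda k: abs(TABLE_14_1_RHO[k] - rho))

def estimate_s_c(u25, u75):
    """Solve the two equations 14.40 for the scale factor and zero point."""
    s_hat = (U75_K5 - U25_K5) / (u75 - u25)
    c_hat = u25 - U25_K5 / s_hat
    return s_hat, c_hat

rho_rat1 = ratio(29.7, 30.3, 33.1)
s_hat, c_hat = estimate_s_c(30.3, 33.1)
print(f"rat 1: rho = {rho_rat1:.2f}, s = {s_hat:.2f}, c = {c_hat:.1f}")
print(f"k for the group mean ratio 3.148: {nearest_k(3.148)}")
```

Here $k$ is chosen from the mean ratio over animals rather than from each animal's own ratio, following the text; the individual ratios vary too much to fix $k$ rat by rat.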
Once having obtained estimates of $s_i$ and $c_i$ for each rat, the observed
latencies can be transformed by equation 14.32. These transformed
latencies can then be combined to yield an empirical distribution for all
animals on the last 40 trials. This distribution corresponds to the
theoretical distribution given by equation 14.33. In Fig. 14.3 we compare the
two distributions. The agreement appears satisfactory over most of the
range, but two things can be noted. First, a small number of negative
values of $x$ appear in the empirical distribution; this presumably is a
result of sampling errors in the estimates of the parameters. Second,
the observed large values of $x$ are not predicted by the theoretical
distribution. About 5 percent of the observed values of $x$ are 13.5 or
more whereas the area under the computed curve beyond that point is
about 0.3 percent.

Fig. 14.3. Distribution of transformed latencies for the last forty trials of Weinstock's
continuous reinforcement group. The theoretical distribution function was computed
from equation 14.33, with $k = 5$. [Histogram and curve plotted against $x$.]

The preceding analysis has been concerned only with the asymptotic
distribution for which it was assumed that $p = 1$. We now investigate
the distributions for blocks of trials and estimate $p$ for each block. For
estimation purposes we proceed as if $p$ were constant within a block.
The median (50th percentile) is used for this purpose. First, the median
latency $U_{50}$ of each rat is obtained for a particular block of trials. Then
this latency is transformed by the equation

(14.42)    $u_{50} = s_i(U_{50} - c_i).$

In this way, we obtain 23 estimates of $u_{50}$ for a given block of trials.
(When $p = 1$ and $k = 5$, $u_{50} = 4.67$.) The median of these 23 estimates
is then determined; denote this median by $m$. The estimate of $p$ for
that block of trials is then

(14.43)    $\hat{p} = 4.67/m.$

In Table 14.3 we show the estimates obtained. The latencies were
changing rapidly during the first five trials, and so we estimate $p$ for each
of these trials separately. These estimates are also given in Table 14.3.

TABLE 14.3

Estimates of $p$ for blocks of trials
obtained from equation 14.43 and
values of $p_n$ computed from
equation 14.46, with $p_0 = 0.014$
and $\alpha = 0.974$.

 Trials      p̂        p_n
 0          0.014     0.014
 1          0.023     0.040
 2          0.022     0.065
 3          0.083     0.089
 4          0.079     0.113
 5-9        0.134     0.181
 10-19      0.246     0.328
 20-29      0.477     0.484
 30-39      0.603     0.604
 40-49      0.652     0.695
 50-59      0.682     0.766
 60-69      0.753     0.821
 70-79      1.006     0.862
 80-89      1.194     0.894
 90-99      0.905     0.918
 100-108    0.936     0.937

We need to transform equation 14.28 by equation 14.32; we obtain the
result

(14.44)    $\psi(x_n) = \dfrac{p_n}{(k - 1)!}\, (p_n x_n)^{k-1} e^{-p_n x_n}.$


From this equation, we can compute the theoretical distribution for each
block of trials by using the estimated value of $p_n$. However, we prefer
to generate values of $p_n$ from the model and thereby fit the entire sequence
of distributions.

We assume that the reward that occurs on each trial of acquisition has
an associated operator $Q$ which is applied to $p$. Since we have already
assumed that $p = 1$ asymptotically, we take

(14.45)    $Qp = \alpha p + (1 - \alpha).$

We further assume that no other events alter $p$ and so we have

(14.46)    $p_n = Q^n p_0 = \alpha^n p_0 + (1 - \alpha^n).$

From the latency data we need to estimate the two parameters, $p_0$ and $\alpha$.
(It is assumed that all animals have the same values of $p_0$ and $\alpha$.) We
estimate $p_0$ from the data on trial 0. In Table 14.3 we see that

(14.47)    $\hat{p}_0 = 0.014.$


The learning parameter $\alpha$ is estimated by the procedure used in Chapter
13. We sum $p_n$ from trial 0 to $N - 1$ and get

(14.48)    $\sum_{n=0}^{N-1} p_n = N - \dfrac{1 - \alpha^N}{1 - \alpha}\,(1 - p_0).$

The sum on the left side of this equation is estimated by summing the
estimates of $p_n$ given in Table 14.3. (An estimate for a block of trials is
taken as the estimate for each trial in the block.) This sum is 74.50.
Taking $p_0 = 0.014$ and $N = 109$, we can solve equation 14.48 numerically
to get

(14.49)    $\hat{\alpha} = 0.974.$


Using the above estimates of p_0 and α we can compute the values of p_n
from equation 14.46; these are shown in the last column of Table 14.3.
(The middle value of n in each block was used for the computation of p_n.)
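As a rough check on the last column of Table 14.3, equation 14.46 can be evaluated directly. In this sketch the function names are hypothetical, and the midpoint convention (for example n = 14.5 for trials 10-19) is an assumption about what "middle value" means. Under that assumption the single-trial values match the table to three decimals, and the block values agree to within a few thousandths.

```python
def p_model(n, p0=0.014, alpha=0.974):
    """Equation 14.46: p_n = alpha^n * p0 + (1 - alpha^n)."""
    a_n = alpha ** n
    return a_n * p0 + (1.0 - a_n)


def block_midpoint(first, last):
    """Middle value of n in a block (assumed convention: arithmetic midpoint)."""
    return 0.5 * (first + last)


# Trials 0 through 4 are tabled individually.
SINGLE_TRIALS = [round(p_model(n), 3) for n in range(5)]

# Two of the blocked entries, evaluated at the block midpoint.
BLOCK_10_19 = p_model(block_midpoint(10, 19))
BLOCK_100_108 = p_model(block_midpoint(100, 108))
```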

The theoretical distribution for each block of trials was computed from
equation 14.44 by using k = 5 and the values of p_n just obtained. These
distributions are compared with the distributions of transformed
observations in Fig. 14.4. In several trial blocks there are noticeable
discrepancies between the observed and computed distributions, but we wish
to reiterate that we have fitted the whole sequence of distributions rather
than just the means or even the distributions one at a time. Somewhat
better can be done by using the block estimates of p_n rather than the
computed values p_n.


[Fig. 14.4. Distributions of transformed latencies for ten blocks of trials.
The histograms were obtained by transforming the raw data, using equation
14.32 and the estimates shown in Table 14.1. The smooth curves were obtained
from equation 14.44, with k = 5 and the values of p_n shown in Table 14.3.]


A latency model should be concerned with the sequence of distributions 
and not primarily with obtaining close-fitting curves on single trials or 
separate blocks of trials. Nor is it enough for a stochastic model to fit 
just one parameter of the sequence of distributions. The model might
do this very well, and still not do justice to the variability or the general
shape of the distribution.

We do not proceed to make formal tests of significance for goodness-of-fit
because it is obvious that there are consistently too many large
observations. This matter is discussed further in the next section.


14.7 CONCLUDING REMARKS 


In the analysis of the Weinstock latency data we were confronted with
two serious problems. First, a casual inspection of the data shows
marked differences among animals, and, second, a small percentage of
extremely large observations was present. The between-animal differences
were handled by estimating for each animal two parameters, a zero point
and a scale factor, and then making appropriate linear transformations
on the raw observations to make animals commensurate. The problem
of the large observations was essentially circumvented by selecting
estimation procedures that are insensitive to them. The result of this
procedure is that the model gives a reasonably satisfactory description of
the main portion of the data but does not properly predict the very large
latencies.

We can take at least two views about the large observed latencies.
We might argue that they arise from variations in experimental conditions
which are of no interest, that is, that they result from environmental
events which are irrelevant to the learning phenomena being studied.
Another position, and probably a more tenable one, is that the large
latencies are part of the data and cannot be dismissed. Accordingly, we
would argue that the model applied in this chapter is only a first
approximation. A more refined model would have a built-in mechanism for
generating a small percentage of very large numbers. Some obvious
sources of large latencies are animals turning around in the apparatus or
other activities that tend to cancel previous goal-directed movements. The
model we use in analyzing Weinstock's data does not include this
possibility of cancellation.

Although it would be useful to develop a model that would fit the long
tail as well as the main body of a latency distribution, we do not do so here.
Such an analysis would be very specific to latency problems, rather than a
direct application of the general model given in Part I. Our purpose in
this chapter is to illustrate how the general model can be applied to time
problems. We make no pretense to completeness.


14.8 SUMMARY 


Two models for describing latency and running time data from simple
runway experiments are described in this chapter. The first model
allows only discrete values of times. The second model involves a
continuous distribution and so overcomes some of the objections to the
discrete model. Both models assume that an experimental trial consists
of a sequence of acts or movements belonging to two classes: those which
are "goal-directed" and those which are not. The probability of a
goal-directed act (p_n) is transformed by an operator Q on each trial when
reward is given in the goal box. The continuous time model is used to
analyze the distributions of latencies obtained by Weinstock [2]. The
main portions of those distributions in the sequence are adequately
described, but the model does not satisfactorily handle the large observed
values. The model, therefore, is regarded as a first approximation
rather than an adequate description.


REFERENCES

1. Graham, C., and Gagné, R. M. The acquisition, extinction, and spontaneous
recovery of a conditioned operant response. J. exp. Psychol., 1940, 26, 251-280.

2. Weinstock, S. Resistance to extinction of a running response following partial
reinforcement under widely spaced trials. J. comp. physiol. Psychol., 1954, 47,
318-322.

3. Estes, W. K. Toward a statistical theory of learning. Psychol. Rev., 1950, 57,
94-107.

4. Bush, R. R., and Mosteller, F. A mathematical model for simple learning. Psychol.
Rev., 1951, 58, 313-323.

5. Mood, A. M. Introduction to the theory of statistics. New York: McGraw-Hill,
1950.

6. Pearson, K. (ed.) Tables of the incomplete Γ-function. London: His Majesty's
Stationery Office, 1922.

7. Fisher, R. A. Contributions to mathematical statistics. New York: Wiley, 1950,
pp. 10.332-10.337.


CHAPTER 15
Evaluations


15.1 PURPOSE OF THIS CHAPTER 


In Part I of this book we discussed some mathematical properties of a
general model, and in Part II we described several applications to
experiments on learning. As a result we are now in a position to make some
critical evaluations. In the last five chapters we have repeatedly pointed
out the usefulness of the model for various purposes, but in this chapter
we emphasize weaknesses and shortcomings of the model and indicate
some unsolved problems. In the final section, for an ideal situation, we
compare the use of the model with that of classical curve-fitting.


15.2 MEASURES OF BEHAVIOR 


The fundamental measure of behavior used throughout this book is the
probability of occurrence of a given class of responses. In the general
model we introduced a set of mutually exclusive and exhaustive
alternatives and assigned a probability measure to each. Such a starting point
leads to certain special difficulties in handling some kinds of data; indeed,
some readers may have felt in reading Chapter 1 that the general model
was a very restricted one.

We are not greatly concerned about the requirement that the
alternatives be mutually exclusive, even though we can easily find observable
classes of behavior that occur simultaneously, for example, running and
breathing. We have already argued that the set of all possible behavior
elements can always be partitioned into classes which are disjunct [...].
Moreover, [...] such a way that the set is exhaustive. [...]
alternatives leads to little trouble until we introduce the probability
measure on this set. For the probability measure to make sense in
[...] one and only one alternative [...]. In
many experiments, such as those discussed in Chapters 10 through 13,
the trials were built into the experiment, but in Chapter 14, where we were 
interested in latency and running time, we were forced to make somewhat 
different identifications. A trial in the model was no longer an experi- 
mental trial but instead was identified with a small time increment. 
In this way we were able to relate latency and running time to our prob- 
ability variables. (In some experiments we may study two or more 
measures of learning. For example, in the T-maze we could record 
latencies as well as choices. The model we describe offers no direct 
relation between those two measures. Cf. Section 13.2.) 

Rate of responding can also be related to probabilities of appropriate 
response classes, as has been demonstrated in the literature [1, 2]. Further- 
more, there is little difficulty in handling behavioral measures such as 
number of errors and resistance to extinction. On the other hand, an 
important experimental measure of behavior we have not successfully 
handled with our general model is response intensity. In this category 
we include amount of saliva flow in Pavlovian conditioning, galvanic 
skin reaction, kick deflection, and grams pull as used in conflict experi- 
ments. These measures are concerned with the strength or intensity of 
an operationally defined class of behavior, whereas our general model 
considers only the occurrence or non-occurrence of a particular behavior 
class. Although it is possible to conceive of identifications which would 
force intensity measures into our general framework, we have found no 
really satisfactory way of doing this. The principal suggestion that comes
to mind is to set up a distribution function for the intensity and to make 
the operators change a parameter of the distribution. As in the latency 
problem, this method entails considerable bother. 


15.3 BASIC ASSUMPTIONS


One of the basic axioms of the general model is the assumption of
path independence: that the set of probabilities after an event has occurred
depends only upon the set of probabilities just prior to the event and upon
an operator associated with that event. For example, if some event occurs
on each trial (and not at other times), and if we have only two alternatives
with probabilities p_n and 1 - p_n on trial n, then p_{n+1} depends only on p_n
and on what event occurred on trial n; the values of p_{n-1}, p_{n-2}, etc., are
irrelevant. Heuristic objections can be raised against the assumption of
path independence; the model does not seem to provide for "memory,"
for a "practice effect," or for long-range effects of "trauma."

Whether or not such objections are appropriate depends, we believe,
upon the specific event identifications which are made. In principle, we
can always make identifications that impose as long a "memory" as we
want. For example, suppose we have a sequence of successes and failures,
SFFSSSFSS. The obvious identifications are to call S and F the two
events, but we could just as well consider the sequence to contain four 
kinds of events, SS, SF, FS, FF; or we could consider triplets to be the 
events, etc. Therefore, we take the position that the general model is 
not seriously restricted by the path independence axiom, though our 
specific applications may be misleading here because they make the 
obvious single-trial identifications. The problem is to make the most 
appropriate event identifications for each application. 
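The re-identification of events described above can be illustrated with a short sketch. The helper name `recode` is hypothetical, and the use of overlapping windows is an assumption: the text does not say whether pairs and triplets overlap. With overlapping windows, every trial after the first still closes exactly one event.

```python
def recode(sequence, width):
    """Re-code single-trial outcomes as overlapping events spanning `width` trials."""
    if width < 1 or width > len(sequence):
        raise ValueError("width must be between 1 and len(sequence)")
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


TRIALS = "SFFSSSFSS"
SINGLES = recode(TRIALS, 1)    # the obvious identification: events S and F
PAIRS = recode(TRIALS, 2)      # four kinds of events: SS, SF, FS, FF
TRIPLETS = recode(TRIALS, 3)   # eight possible kinds of events
```

Each widening of the window lengthens the "memory" carried by an event while leaving the path-independence axiom itself untouched.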

The other basic axiom of the general model is the assumption that the 
event operators are linear. We introduced this axiom very early in our 
development because we have never used non-linear operators in handling 
any specific set of data. Nevertheless we recognize a danger in being 
specific too soon: if the linearity axiom proved to be unsatisfactory, 
nearly everything we have said in this book would have to be modified. 
We could have established some rather general results without the linearity 
assumption and then illustrated the implications for the special case of 
linear operators. Or we could have introduced a less severe restriction,
such as monotonicity. We chose to do otherwise, however; we
introduced the linearity axiom early so that we could get to actual computations
and data analysis as soon as possible. We might defend the linearity
assumption on theoretical grounds by reminding the reader of the stimulus
model described in Chapter 2 which yields linear operators and of the
Savage theorem mentioned in Section 1.8 which shows that only linear
operators can satisfy the combining-of-classes restriction. The defense
on empirical grounds rests more with the reader's evaluation of
goodness-of-fit in the examples given, and with similar evaluations in future
applications.


If it should turn out that the linearity assumption is untenable for a
certain class of problems, a possible remedy is the following. An
intervening variable could change according to linear operators, and its
relation to response probabilities and other behavioral measures could be
determined (cf. Section 1.4, last paragraph). In effect, Estes and Burke
[3] use such a strategy in their general model of stimulus variability.


15.4 MATHEMATICAL AND STATISTICAL PROBLEMS 


In the Introduction we argued that we should investigate the
mathematical properties of a model before applying it to actual data. In Part I
we attempted to do just that, but many problems remain unsolved. For
example, we know relatively little about the asymptotic distribution
except in some special cases. Karlin [4] and Thompson [5] have obtained
results which indicate that no simple exact computational scheme is
possible. In Chapter 6, therefore, we developed some approximate
methods for studying the asymptotic distribution. The computation of 
moments also raises some mathematical problems which we have not 
solved except in some very special cases. Another of the many problems 
incompletely solved in this book is the distribution of runs, discussed in 
Section 4.8. 

Problems in the statistical estimation of parameters and in measuring 
goodness-of-fit have not been handled satisfactorily in this book. We 
have relied on minor modifications of standard techniques and improvised 
procedures for each specific problem. Furthermore, we usually imposed 
restrictions on the values of the limit points λ_i so as to obtain workable
estimation procedures. Though we obtained estimators for the para- 
meters in each specific model, we often knew little about their properties. 
To compare two estimates, we need better information about their distri- 
butions than we usually provide. More general methods are needed. 


15.5 EXPERIMENTAL PROBLEMS 


Undoubtedly the greatest disappointment to the psychologist in reading 
this book is our apparent lack of attention to a variety of variables which 
have been studied in learning experiments. Our model makes no reference 
to drive strength, amount of reward, amount of work, delay in reinforce- 
ment, etc. We would like to make quite clear that (1) we consider such 
variables to be important in the psychology of learning, and (2) we make 
no pretense of having anything to say about them in this book. We have 
focused attention on the stochastic aspects of learning and performance— 
the detailed properties of sequences of responses observed in experiments. 
In some problems we have illustrated how the parameters in the models 
change when experimental conditions are altered, but we make no attempt 
to explain such changes with the models (see Sections 11.8 and 13.11). 
We could readily introduce ad hoc assumptions to describe how certain
parameters depend upon variables like drive strength. In effect this is 
what Hull has done in his system [6]. We have chosen to leave these 
questions, which are parametric in our model, both to experimental 
investigation and to psychological theories in which they are not para- 
metric. On the other hand, we hope that our general stochastic model 
will provide a framework within which such questions can be discussed. 

We have not attempted to handle experimental phenomena such as 
stimulus generalization, discrimination, spontaneous recovery, patterns 
of reinforcement, secondary reward, and response chaining. 


15.6 THEORETICAL INTERPRETATIONS 


Throughout this book we have attempted to divorce our model from
particular psychological theories and have constantly talked about
"describing" data rather than "explaining" it. We believe that many
interpretations of our results are possible, and in a few places we have 
suggested some psychological interpretations. Nevertheless, we recognize 
that we have been strongly influenced by the Hullian and Guthrian schools 
of thought, and the general model undoubtedly exhibits this bias. We 
have not tried to relate the model to cognitive theory, for example. In 
some ways our approach is similar to Skinner's: we have attempted to
stay close to data and observable acts and events. We have not presented 
a model of the organism but have developed models for experiments. 


15.7 CONCLUDING REMARKS 


We have often been asked to explain the sense in which our general 
model, or one of our specific models, represents anything more than 
curve-fitting. Furthermore, we have been asked to explain the sense in 
which the parameters of the model are psychologically meaningful. 
Unfortunately such a discussion could come only after applications had 
been provided. An explanation can best go forward if we think of an 
idealized situation in which the model is correct in every particular. 
People seriously asking about these issues understand perfectly well that, 
in the end, the question of goodness-of-fit will arise, but they are puzzled 
about the status of the model even if it consistently fits real data very closely.
Therefore we set aside the goodness-of-fit question for the moment.
Part of the trouble arises from the fact that if a sequence of rather similar
experiments, say rote-learning studies, is planned, our model does not
predict in advance the various parameters that will be found. Once this
is understood, the original questions arise with still more force. 

Suppose a learning experiment, involving 1000 subjects for 300 trials
each, has recorded either a success or a failure for each subject on each
trial. These data are turned over to a clerk for summary analysis.
There are now literally thousands of independent questions that can be
asked about these data without ever getting down to the level of the single
cell (outcome for a particular subject on a particular trial). To make
the point clear we shall list a few of the more usual questions and a few
of the more esoteric ones. For our purpose it does not matter whether
or not a psychologist would be interested in the entire range of such
questions. The point is that each can be asked of the data and an answer
could be obtained. Examples are:


1. What is the overall percentage of successes?

2. What are the trial-by-trial success percentages?

3. What is the best-fitting cubic curve that can be drawn through the
trial-by-trial percentages?

4. What is the mean trial number on which the last failure occurred?
(How about the median instead of the mean?)

5. What is the mean trial number on which the first success occurred?
(The second, the third, ...?)

6. What is the average number of runs of successes? Of failures?

7. What percentage of subjects had at least twenty successes in a row?

8. What percentage of the subjects had failures on the third, fourth,
ninth trials and successes on the eighteenth and twentieth? (This question
can be varied thousands of ways.)

9. What is the correlation between numbers of successes in the first
twenty and last twenty trials?

10. What is the variance of the subjects' total scores?
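A few of these questions can be made concrete with a short sketch of the clerk's arithmetic. Everything here is illustrative: the function names are hypothetical, the data sheet is a made-up toy of 3 subjects and 6 trials rather than the 1000 by 300 sheet of the text, and conventions such as 1-based trial numbers are assumptions.

```python
from statistics import mean, pvariance


def overall_success_rate(data):            # question 1
    return mean(x for row in data for x in row)


def trial_success_rates(data):             # question 2
    return [mean(col) for col in zip(*data)]


def last_failure_trial(row):               # question 4 (1-based; 0 if no failure)
    return max((t for t, x in enumerate(row, start=1) if x == 0), default=0)


def first_success_trial(row):              # question 5 (1-based; None if no success)
    return next((t for t, x in enumerate(row, start=1) if x == 1), None)


def count_runs(row, value):                # question 6
    """Number of maximal runs of `value` in one subject's record."""
    return sum(1 for t, x in enumerate(row)
               if x == value and (t == 0 or row[t - 1] != value))


def total_score_variance(data):            # question 10
    return pvariance([sum(row) for row in data])


# Hypothetical toy data sheet: rows are subjects, 1 = success, 0 = failure.
SHEET = [
    [0, 0, 1, 0, 1, 1],
    [0, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1],
]
```

Each function goes back to the raw sheet and shares nothing with the others, which is the point the text makes about the clerk: every new question costs a new computation.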


Even this small group of questions makes clear the rich variety of
possible summary questions that can be asked of this simple data sheet.
A complete summary could clearly take hundreds of pages. The clerk
working over these data would just have to compute a new number in
response to each question. There would be no necessary relation between
the answers to the different questions. Knowing answers to one hundred
questions would not help appreciably in answering the next hundred.

The clerk may find it convenient and useful to employ classical methods
of curve-fitting in preparing to answer these questions. For example,
in question 5 above, he might compute the mean trial of the nth success
for each n in the data, and then fit one of the common curves (linear,
quadratic, logarithmic, or exponential function) to those means. When
asked about the mean trial number for the third success, he might report
the fitted answer rather than the one computed from the raw data. This
answer might be preferable because local fluctuations are smoothed out
by the fitted function. The curve itself may be of interest because it
summarizes the trend of the observed means. However, when the clerk
is asked question 9, for example, he must return to the data, and if he
chooses can do some additional curve fitting for correlation coefficients.
Hence, for each class of questions a new curve could be fitted. Many
classes of questions are possible, however, and therefore many numerical
functions would be required to summarize all the information in the data.

A different level of analysis can be accomplished with a model.
Suppose our model fitted closely the results of experiments like the one
described. Then after a few questions had been asked and answered
(say questions 1, 5, 6), we would obtain a few numbers, say three (we will
regard these as parameters). With these three numbers we would be
able to retire from the data and be prepared to answer question for
question with the clerk. Our answers would not agree perfectly with
the clerk's responses, but they would be generally close. Thus on the 
basis of three numbers we are prepared, in principle, to answer all the 
questions the original data sheet can answer provided that the questions 
do not get down to the level of a single cell. ("In principle," means it 
might take us a long time, but with computing machines we could do it.) 
This is saying a great deal. Furthermore, if we are given the three 
numbers we can work out in advance the answers to the questions, just 
in case someone ever comes across an experiment that has these three 
parameters. In addition, we would be glad to turn the rules for finding 
the three numbers over to someone else, and tell him how to generate 
the various answers so that when an experiment is done he could see for 
himself just how the generated answers agree with the results of the 
experiment. Thus for three numbers derived from the data we plan to 
be able to answer a wealth of questions. If these three numbers plus the
model can answer every possible question above the level of the single 
cell, it is hard to see what more succinct way there is to summarize the 
data (unless another model can do it with fewer parameters). 
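The idea of answering questions from three numbers alone can be sketched by simulation. This is a hedged illustration, not the authors' computing procedure: it assumes a two-operator linear model in which p moves toward 1 after either outcome, with the three numbers being p_0 and one α for each outcome. The parameter values, the function names, and the reduced sheet size are all hypothetical.

```python
import random


def simulate_subject(p0, alpha_success, alpha_failure, n_trials, rng):
    """One simulated record (1 = success, 0 = failure) under two linear operators.

    Assumed form: after either outcome, p <- alpha * p + (1 - alpha),
    where alpha depends on whether the trial was a success or a failure.
    """
    p, record = p0, []
    for _ in range(n_trials):
        success = rng.random() < p
        record.append(1 if success else 0)
        alpha = alpha_success if success else alpha_failure
        p = alpha * p + (1.0 - alpha)
    return record


def simulate_sheet(params, n_subjects, n_trials, seed=0):
    """Generate a whole data sheet from the three parameters and a fixed seed."""
    rng = random.Random(seed)
    return [simulate_subject(*params, n_trials, rng) for _ in range(n_subjects)]


PARAMS = (0.10, 0.95, 0.90)   # hypothetical p0, alpha_success, alpha_failure
SHEET = simulate_sheet(PARAMS, n_subjects=200, n_trials=50)

# Model-generated answers to questions 1 and 2, obtained without the real data.
OVERALL = sum(map(sum, SHEET)) / (200 * 50)
TRIAL_RATES = [sum(col) / 200 for col in zip(*SHEET)]
```

Any summary question above the level of the single cell can be put to the simulated sheet in the same way and its answer compared with the clerk's.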

Now what about the psychological meaningfulness of the three para- 
meters? In almost any situation like the one we describe, if there are 
three numbers that will do all this heavy labor, there will be many other 
sets of three numbers derivable from the first three that can also be used
to achieve the same end. Some of these sets of three numbers can be
given familiar names and descriptions more easily than others.
Furthermore, some sets of parameters that appear strange initially seem, after
maturer reflection and experience, more suitable than others. And,
again, which set is more suitable sometimes depends on the specific
nature of the question. In elementary mathematics we are used to having
both rectangular and polar coordinates available (two of the many
possible kinds), and we seldom, if ever, see a discussion as to which is
more meaningful, though there is plenty of evidence that some problems
that are easy in one set become vicious when discussed in the other.
In the course of our own work we have gradually turned from the view
that quantities like p_0, a_i, and b_i are the natural parameters to the view
that p_0, λ_i, and α_i are easier to work with and lead more smoothly to
generalizations. Yet each set has its own simple meaning, and one easy
to explain and use. Thus preference for equivalent basic sets of
parameters may be partly determined by individual taste, but more strongly
determined by the services rendered by the particular sets.

The description of our model in an idealized situation is not unlike
the situation with Kepler's laws for heavenly bodies. After working
with a large set of numbers, he stated a few simple laws that explain
(describe) a great deal of what had been measured. On the other hand,
if a new heavenly body were to appear, the laws tell us very little that is
absolutely quantitative about the motion. But given in addition to the
laws a few accurate measurements for the particular body, its course
both before and after the observations could be described in specific
detail. There is another similarity too; the chemical composition of the
body, its temperature, color, and other properties that would interest an
observer were not included in the laws. Again, for our model there are
many protocol features and quantitative measurements of deep importance
to a psychologist that not only are not handled but that also we see no
hope of handling.

Reluctantly, we rouse ourselves a little from this delightful dream
world in which our model reproduces psychological data perfectly.
Suppose that when we compare the clerk's detailed computations with the
answers generated by the basic three numbers that there are indeed
hundreds of pages of close agreement, but that there seems to be a class
of questions which the most motherly comparison finds wanting. The
usual procedure is to try to find what special conditions have given
rise to such error. Indeed, reasoning from such a discrepancy in Kepler's
laws, a new planet was found. The model suggested both where not to
look further, and also where investigation was needed. Sometimes such
discrepancies lead to reformulations of the model, sometimes to interesting
discoveries. But in any case a model that predicts a great many things
nearly correctly will almost certainly serve as a temporary baseline
until a more comprehensive model appears. Nor is an outmoded model
always discarded; its use may merely be restricted to a narrower sphere.

The feature of a baseline provided by a model may be viewed in another
important way. It changes considerably our practical view of data.
In much of psychology the search is for statistically significant differences,
and the aim is to show, or at least find out, whether different conditions
lead to different results. Big differences are usually a source of pure
satisfaction. With a quantitative model the emphasis is usually in the
reverse direction. We look for close agreement with the model, and
regard large differences with dissatisfaction as owing to a lack of
understanding. The lack may, of course, stem from the inadequacy or
inappropriateness of the model; it might also stem from inadequacies of
the design and execution of the experiment, but more hopefully it may
stem from a principle soon to be discovered.

Lest some reader has entered this discussion in medias res, we remind
him that we have not had a fit of arrogance, nor do we have delusions of
grandeur: we are merely explaining the value of our type of model under
the most favorable conditions imaginable.


REFERENCES

1. Estes, W. K. Toward a statistical theory of learning. Psychol. Rev., 1950, 57,
94-107.

2. Bush, R. R., and Mosteller, F. A mathematical model for simple learning. Psychol.
Rev., 1951, 58, 313-323.

3. Estes, W. K., and Burke, C. J. A theory of stimulus variability in learning. Psychol.
Rev., 1953, 60, 276-286.

4. Karlin, S. Some random walks arising in learning models I. Pacific J. of Math.,
1953, 3, 725-756.

5. Thompson, G. L., unpublished work.

6. Hull, C. L. Principles of behavior. New York: Appleton-Century-Crofts, 1943.


Tables 


TABLE A

The two functions,

    \Phi(\alpha, \beta) = \sum_{\nu=0}^{\infty} \alpha^{\nu(\nu+1)/2} \beta^{\nu},

    \Psi(\alpha, \beta) = \sum_{\nu=0}^{\infty} \nu \, \alpha^{\nu(\nu+1)/2} \beta^{\nu},

which occurred in Sections 4.8, 8.3, and 8.4 in the computation of run lengths.
This table is used in Sections 9.6 and 11.4 for estimating a parameter and the
variance of that estimate. The first entry in each cell is Φ(α, β) and the second
entry is Ψ(α, β). (The table was prepared by transforming functions computed
by the Harvard Computation Laboratory.)
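The two series converge quickly, so individual table entries can be spot-checked by direct summation. The sketch below is an illustration only: the function name and the stopping rule are assumptions. For α = β = 0.50 it gives values that round to the tabled 1.2833 and 0.3186, and comparing the entries at (.50, .52) and (.52, .50) suggests that rows of the table index α and columns index β.

```python
def phi_psi(alpha, beta, tol=1e-12, max_terms=10_000):
    """Return (Phi, Psi) by summing both series until the terms fall below `tol`.

    Phi(alpha, beta) = sum over v of      alpha**(v*(v+1)/2) * beta**v
    Psi(alpha, beta) = sum over v of  v * alpha**(v*(v+1)/2) * beta**v
    Assumes 0 <= alpha < 1 and 0 <= beta <= 1, so the terms shrink rapidly.
    """
    phi = psi = 0.0
    term = 1.0                      # the v = 0 term of the Phi series
    for v in range(max_terms):
        phi += term
        psi += v * term
        # Ratio of successive terms is alpha**(v + 1) * beta.
        term *= alpha ** (v + 1) * beta
        if (v + 1) * term < tol:
            break
    return phi, psi


PHI_50_50, PSI_50_50 = phi_psi(0.50, 0.50)
```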


TABLE A
α \ β    .50    .52    .54    .56    .58    .60    .62

.50 1.2833 1.2961 1.3090 1.3220 1.3352 1.3485 1.3619 
.3186 3345 .3506 3670 3837 -4006 A179 
52 1.2977 1.3113 1.3250 1.3389 1.3529 1.3671 1.3814 
.3381 .3552 3726 .3904 4084 4268 4455 
“54 1.3126 1.3270 1.3416 1.3563 1.3713 1.3863 1.4016 
.3586 3771 .3959 4151 4346 4546 4749 
56 1.3280 1.3433 1.3587 1.3744 1.3902 1.4063 1.4225 
.3802 4001 .4205 4412 4624 .4841 -5061 
.58 1.3438 1.3600 1.3765 1.3931 1.4100 1.4271 1.4443 
.4030 4245 4465 .4690 4920 .5155 +5395 
.60 1.3602 1.3774 1.3949 1.4126 1.4305 1.4487 1.4671 
4271 4504 4742 .4986 .5235 .5491 5152 
.62 1.3772 1.3955 1.4140 1.4328 1.4519 1.4712 1.4909 
4527 44779 .5037 -5301 5573 .5851 :6136 
.64 1.3949 1.4142 1.4339 1.4539 1.4742 1.4948 1.5158 
4799 .5072 .5352 .5639 .5934 .6238 .6549 
66 1.4132 1.4338 1.4547 1.4759 1.4976 1.5195 1.5419 
.5090 .5385 -5689 .6002 6324 6655 6996 
68 1.4324 1.4542 1.4764 1.4990 1.5221 1.5455 1.5695 
-5401 .5722 .6052 .6393 .6745 7107 -7480 

4170 1.4524 1.4755 1.4992 1.5233 1.5479 7 ECCE 
‘5735 “6084 16444 16816 7201 13598 8009 
72 1.4733 1.4980 1.5231 1.5488 1.5751 1.6019 1.6293 
6095 6475 6868 7276 7698 .8135 8588 
14 1.4953 1.5215 1.5484 1.5758 1.6039 1.6327 1.6621 
.6484 .6899 330 3178 .8243 .8725 .9225 
.16 1.5184 1.5464 1.5750 1.6044 1.6345 1.6654 1.6971 
.6907 7362 .7836 8329 .8842 .9376 :9931 
78 1.5428 1.5727 1.6034 1.6349 1.6672 1.7004 1.7346 
.7369 .7869 .8391 .8936 .9505 1.0099 1.0718 
.80 1.5687 1.6007 1.6336 1.6674 1.7022 1.7381 1.7750 
7876 -8428 -9006 -9611 1.0245 1.0909 1.1603 
.82 1.5963 1.6306 1.6659 1.7024 1.7400 1.7788 1.8188 
.8437 .9049 9691 1.0367 1.1077 1.1823 1.2606 
84 1.6258 1.6626 1.7007 1.7401 1.7809 1.8231 1.8667 
.9061 .9742 1.0461 1.1220 1.2020 1.2865 1.3755 
86 1.6575 1.6972 1.7384 1.7812 1 8256 1.8716 1.9194 
.9763 1.0526 1.1336 1.2193 1.3103 1.4067 1.5088 
.88 1.6918 1.7348 1.7796 1.8262 1.8748 1.9253 1.9780 
1.0558 | 1.1421 | 1.2340 | 1.3319 | 14362 | 15474 | 1.6658 
.90 1.7291 1.7760 1.8250 1.8761 1.9296 2.0440 
1.1472 1.2455 1.3509 1.4640 1.5851 12150 1.8543 
92 1.7703 1.8216 1.8755 1.9320 1.9914 2.0538 2.1194 
1.2538 1.3673 1.4897 1.6220 1.7648 1.9193 2.0862 
94 1.8161 1.8728 1.9326 1.9958 2.0624 2.1330 2.2076 
1.3806 | 1.5136 | 1.6585 | 1.8162 | 19881 | 21759 | 2.3810 
96 1.8680 1.9313 1.9985 2.0699 2.1460 22211 2.3137 
1.5358 | 1.6950 | 1.8703 | 2.0638 | 2271 | 2:509 | 2.7744 
98 1.9280 1.9998 2.0766 | 2.1592 2.2480 2.3438 2.4474 
1.7320 | 1.9302 | 2.1508 | 2.3958 | 2.6744 | 2.9862 | 3.3400 


TABLE A (continued)

α \ β    .64    .66    .68    .70    .72    .74    .76

-50 1.3755 1.3891 1.4029 1.4168 1.4309 1.4451 1.4594 
4354 4531 4712 .4895 .5082 5271 5463 

152 1.3958 1.4104 1.4252 1.4400 1.4581 1.4703 1.4856 
4645 4839 .5036 .5236 .5439 .5646 .5857 

54 1.4170 1.4325 1.4483 1.4642 1.4803 1.4965 1.5130 
4956 .5166 .5381 .5599 .5822 .6048 .6279 

56 1.4389 1.4556 1.4724 1.4894 1.5066 1.5240 1.5416 
.5286 .5516 .5150 .5989 .6232 .6481 6733 

58 1.4619 1.4796 1.4976 1.5157 1.5342 1.5528 1.5717 
.6146 .6408 6674 6947 3224 

60 1.5239 1.5434 1.5631 1.5831 1.6034 
6573 6859 2152 7451 1137 

-62 1.5516 1.5724 1.5936 1.6150 1.6368 
.7034 22347 7669 .1998 .8335 

64 1.5807 1.6030 1.6257 1.6487 1.6722 
.7533 7878 .8232 .8595 .8968 

-66 1.6114 1.6353 1.6597 1.6845 1.7097 
.8076 .8456 .8847 .9249 9662 

-68 1.6439 1.6696 1.6958 1.7225 1.7497 
.8670 .9090 9523 9968 1.0427 

70 1.6784 1.7061 1.7343 1.7631 1.7925 
9322 .9787 1.0268 1.0764 1.1276 

72 1.7152 1.7450 1.7755 1.8067 1.8385 
5 1.0041 1.0560 1.1095 1.1650 1.2223 
m 1.7545 1.7868 1.8198 1.8536 1.8881 
ES 1.0841 1.1420 1.2020 1.2642 1.3286 
: 1.7968 1.8318 1.8676 1.9043 1.9420 
a 1.1735 1.2384 1.3060 1.3761 1.4491 
: 1.8426 1.8806 1.9196 1.9597 2.0009 
1.2742 1.3475 1.4240 1.5036 1.5867 

80 1.8923 1.9338 1.9764 2.0204 2.0656 
32 1.3887 1.4720 1.5591 1.6502 1.7456 
: 1.9468 1.9922 2.0391 2.0875 2.1376 
1.5202 1.6156 1.7157 1.8208 1.9312 

34 2.0069 2.0570 2.1088 2.1625 2.2182 
1.6731 1.7833 1.8996 2.0221 2.1514 

En 2.0740 2.1295 2.1873 2.2473 2.3097 
88 1.8535 1.9824 2.1190 2.2637 2.4171 
` 2.1497 2.2118 2.2767 2.3444 2.4152 
2.0701 2.2230 2.3860 2.5599 2.7452 

is 2.2364 2.3068 2.3805 2.4579 2.5391 
92 2.3364 2.5212 2.7198 2.9329 3.1619 
i 2.3378 2.4186 2.5038 2.5937 2.6887 
94 2.6736 2.9027 3.1509 3.4199 3.7117 
B 2.4596 2.5542 2.6548 2.7618 2.8760 
9 3.1188 3.4131 3.7355 4.0893 4.4779 
de 2.6117 2.7259 2.8487 2.9810 3.1238 
3.7439 4.1428 4.5869 5.0840 5.6393 

i 2.8146 2.9598 3.1188 3.2936 3.4862 
4.7179 5.3128 5.9962 6.7814 7.6913 


TABLE A (continued)


N        .78      .80      .82      .84      .86      .88

.50    1.4738   1.4884   1.5031   1.5180   1.5329   1.5480
        .5658    .5856    .6058    .6262    .6469    .6680
.52    1.5011   1.5167   1.5325   1.5485   1.5646   1.5809
        .6071    .6288    .6509    .6734    .6962    .7194
.54    1.5296   1.5464   1.5633   1.5805   1.5978   1.6154
        .6514    .6753    .6996    .7243    .7495    .7751
.56    1.5595   1.5775   1.5957   1.6142   1.6328   1.6517
        .6991    .7254    .7522    .7795    .8073    .8356
.58    1.5908   1.6102   1.6298   1.6497   1.6698   1.6902
        .7508    .7798    .8093    .8394    .8702    .9015
.60    1.6239   1.6448   1.6659   1.6873   1.7090   1.7310
        .8069    .8389    .8715    .9049    .9389    .9737
.62    1.6589   1.6813   1.7041   1.7271   1.7506   1.7743
        .8681    .9034    .9395    .9765   1.0144   1.0531
.64    1.6959   1.7201   1.7447   1.7696   1.7949   1.8206
        .9350    .9742   1.0143   1.0555   1.0976   1.1409
.66    1.7353   1.7614   1.7880   1.8149   1.8424   1.8703
       1.0086   1.0521   1.0969   1.1428   1.1899   1.2383
.68    1.7774   1.8056   1.8343   1.8636   1.8934   1.9237
       1.0899   1.1385   1.1885   1.2399   1.2928   1.3472
.70    1.8225   1.8530   1.8842   1.9160   1.9484   1.9815
       1.1803   1.2348   1.2909   1.3487   1.4083   1.4698
.72    1.8710   1.9042   1.9381   1.9728   2.0082   2.0444
       1.2815   1.3427   1.4060   1.4713   1.5388   1.6086
.74    1.9235   1.9597   1.9967   2.0346   2.0734   2.1131
       1.3954   1.4646   1.5364   1.6106   1.6875   1.7672
.76    1.9806   2.0202   2.0608   2.1024   2.1451   2.1889
       1.5249   1.6036   1.6854   1.7703   1.8585   1.9501
.78    2.0432   2.0867   2.1314   2.1773   2.2245   2.2730
       1.6733   1.7635   1.8574   1.9553   2.0573   2.1635
.80    2.1123   2.1603   2.2098   2.2607   2.3132   2.3673
       1.8452   1.9494   2.0584   2.1722   2.2913   2.4156
.82    2.1892   2.2425   2.2977   2.3546   2.4134   2.4742
       2.0471   2.1687   2.2963   2.4302   2.5707   2.7182
.84    2.2758   2.3355   2.3974   2.4615   2.5280   2.5969
       2.2876   2.4313   2.5827   2.7423   2.9106   3.0880
.86    2.3745   2.4420   2.5122   2.5852   2.6612   2.7403
       2.5797   2.7520   2.9347   3.1282   3.3333   3.5505
.88    2.4890   2.5661   2.6467   2.7310   2.8191   2.9112
       2.9428   3.1535   3.3783   3.6179   3.8736   4.1463
.90    2.6244   2.7140   2.8081   2.9071   3.0112   3.1207
       3.4078   3.6722   3.9563   4.2618   4.5902   4.9435
.92    2.7892   2.8954   3.0079   3.1270   3.2533   3.3872
       4.0282   4.3716   4.7446   5.1497   5.5898   6.0683
.94    2.9977   3.1278   3.2667   3.4154   3.5747   3.7456
       4.9050   5.3750   5.8922   6.4621   7.0905   7.7842
.96    3.2782   3.4454   3.6268   3.8239   4.0385   4.2725
       6.2624   6.9614   7.7470   8.6324   9.6303  10.7581
.98    3.6992   3.9358   4.1992   4.4938   4.8244   5.1971
       8.7458   9.9742  11.4083  13.0912  15.0756  17.4229


TABLE A (continued)


N .90 .92 .94 .96 .98 1.00
50 1.5633 1.5787 1.5942 1.6099 1.6257 1.6416 
.6893 .7110 .7330 .7553 .7780 .8009 
52 1.5973 1.6139 1.6306 1.6476 1.6646 1.6819 
1430 -1669 7913 .8160 «8411 .8666 
54 1.6331 1.6510 1.6690 1.6873 1.7058 1.7245 
8011 .8276 .8546 8820 .9099 .9382 
56 1.6708 1.6901 1.7097 1.7294 1.7494 1.7697 
.8644 .8938 .9237 .9542 .9852 1.0168 
58 1.7108 1.7317 1.7528 1.7742 1.7959 1.8178 
.9335 .9661 .9994 1.0333 1.0679 1.1031 
«60 1.7532 1.7758 1.7987 1.8219 1.8454 1.8692 
1.0093 1.0456 1.0827 1.1205 1.1592 1.1986 
62 1.7984 1.8229 1.8477 1.8729 1.8984 1.9243 
1.0928 1.1333 1.1747 1.2171 1.2604 1.3047 
64 1.8468 1.8733 1.9003 1.9277 1.9555 1.9837 
1.1852 1.2305 1.2770 1.3246 1.3734 1.4233 
+66 1.8987 1.9275 1.9569 1.9867 2.0171 2.0480 
1.2880 1.3390 1.3913 1.4450 1.5001 1.5565 
68 1.9546 1.9861 2.0181 2.0508 2.0840 2.1178 
1.4032 1.4607 1.5199 1.5807 1.6432 1.7074 
10 2.0153 2.0497 2.0847 2.1205 2.1570 2.1942 
1.5330 1.5983 1.6655 1.7347 1.8060 1.8794 
72 2.0813 2.1191 2.1576 2.1970 22373 2.2784 
" 1.6806 1.7549 1.8317 1.9110 1.9928 2.0772 
z 2.1538 2.1953 2.2379 2.2814 2.3260 2.3716 
= 1.8496 1.9349 2. 2.1147 2.2093 2.3071 
ai 2.2338 2.2798 2. 2.3754 2.4250 2.4759 
bs 2.0452 2.1439 2 2.3525 2.4628 2.5772 
2.3229 2.3741 2. 2.4809 2.5365 2.5937 
2.2741 2.3892 2. 2.6339 2.1637 2.8988 
80 2.4230 2.4805 2.5396 2.6006 2.6635 2.7282 
2.5456 2.6814 2.8232 2.9714 3.1262 3.2878 
182 2.5370 2.6019 2.6690 2.7383 2.8100 2.8841 
2.8728 3.0351 3.2052 3.3837 3.5708 3.7670 
84 2.6684 2.7425 2.8194 2.8991 2.9818 3.0677 
š 3.2749 3.4719 3.6794 3.8981 4.1285 4.3713 
#6 2.8226 2.9084 2.9977 3.0907 3.1876 3.2886 
8 3.7808 4.0248 4.2834 4.5574 4.8476 5.1552 
"S 3.0076 3.1085 3.2141 3.3247 3.4405 3.5619 
4.4371 4.7475 5.0785 5.4317 5.8085 6.2106 
— 
i80 3.2360 3.3574 3.4853 3.6202 3.7624 3.9124 
9 5.3234 5.7322 6.1721 6.6454 7.1549 7.1035 
‘22 3.5293 3.6802 3.8406 4.0112 4.1926 4.3858 
T 6.5886 7.1548 7.7709 8.4416 9.1722 9.9682 
3*4 3.9289 4.1260 4.3379 4.5661 4.8121 5.0776 
" 8.5500 9.3966 10.3327 11.3692 12.5176 13.7904 
28 4.5283 4.8083 5.1154 5.4530 5.8248 6.2349 
12.0342 13.4811 15.1233 16.9914 19.1190 21.5464 
38 5.6190 6.0986 6.6465 7.2751 8.0000 8.8400 
20.2135 23.5438 27.5385 32.3474 38.1675 45.2450 


TABLE B 


The function

$$T(\alpha, \beta) = -\frac{1}{\alpha - \beta} \log \frac{1 - \alpha}{1 - \beta}$$

defined in Section 8.2 and used in Sections 9.6 and 11.5 for estimating model
parameters. Since T(α, β) = T(β, α), the parameters α and β may be read along
either margin. (The table was computed by Cleo Youtz and Lotte Bailyn.)
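For readers who want values between the tabled arguments, the following is a minimal sketch in Python. It assumes the closed form T(α, β) = log[(1 − α)/(1 − β)]/(β − α) with the natural logarithm, and takes the diagonal value as the limit 1/(1 − α); both assumptions were checked only against the tabled entries (for example 3.333 at α = β = 0.70 and 4.055 at α = 0.70, β = 0.80), not against Section 8.2 itself. The function name `T` is simply borrowed from the table heading.

```python
import math


def T(alpha: float, beta: float) -> float:
    """Assumed form of the Table B function, natural logarithm.

    T(alpha, beta) = log((1 - alpha) / (1 - beta)) / (beta - alpha),
    with the diagonal alpha == beta taken as the limit 1 / (1 - alpha).
    """
    if not (0.0 <= alpha < 1.0 and 0.0 <= beta < 1.0):
        raise ValueError("alpha and beta must lie in [0, 1)")
    if math.isclose(alpha, beta, rel_tol=0.0, abs_tol=1e-12):
        # Limit of the ratio as beta approaches alpha.
        return 1.0 / (1.0 - alpha)
    return math.log((1.0 - alpha) / (1.0 - beta)) / (beta - alpha)


if __name__ == "__main__":
    # Spot checks against entries printed in Table B.
    print(f"{T(0.70, 0.80):.3f}")  # table entry: 4.055
    print(f"{T(0.70, 0.99):.3f}")  # table entry: 11.728
    print(f"{T(0.90, 0.90):.3f}")  # table entry: 10.000
```

Because the expression is symmetric in its two arguments, `T(0.80, 0.70)` returns the same value as `T(0.70, 0.80)`, which is why the table needs only one triangle.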


α \ β | 0.70  | 0.71  | 0.72  | 0.73  | 0.74  | 0.75  | 0.76  | 0.77  | 0.78  | 0.79

0.70  | 3.333 | 3.390 | 3.450 | 3.512 | 3.578 | 3.646 | 3.719 | 3.796 | 3.877 | 3.963
0.71  |       | 3.448 | 3.510 | 3.573 | 3.640 | 3.711 | 3.785 | 3.864 | 3.947 | 4.035
0.72  |       |       | 3.571 | 3.636 | 3.705 | 3.777 | 3.854 | 3.934 | 4.019 | 4.110
0.73  |       |       |       | 3.704 | 3.774 | 3.848 | 3.926 | 4.009 | 4.096 | 4.189
0.74  |       |       |       |       | 3.846 | 3.922 | 4.003 | 4.087 | 4.177 | 4.272
0.75  |       |       |       |       |       | 4.000 | 4.083 | 4.170 | 4.261 | 4.359
0.76  |       |       |       |       |       |       | 4.167 | 4.256 | 4.351 | 4.451
0.77  |       |       |       |       |       |       |       | 4.348 | 4.445 | 4.549
0.78  |       |       |       |       |       |       |       |       | 4.545 | 4.652
0.79  |       |       |       |       |       |       |       |       |       | 4.762


α \ β | 0.80  | 0.81  | 0.82  | 0.83  | 0.84  | 0.85  | 0.86  | 0.87  | 0.88  | 0.89

0.70  | 4.055 | 4.152 | 4.257 | 4.369 | 4.490 | 4.621 | 4.763 | 4.919 | 5.091 | 5.281
0.71  | 4.129 | 4.229 | 4.336 | 4.451 | 4.575 | 4.709 | 4.855 | 5.015 | 5.191 | 5.386
0.72  | 4.206 | 4.308 | 4.418 | 4.536 | 4.663 | 4.801 | 4.951 | 5.115 | 5.296 | 5.496
0.73  | 4.287 | 4.393 | 4.505 | 4.626 | 4.757 | 4.898 | 5.052 | 5.221 | 5.406 | 5.612
0.74  | 4.373 | 4.481 | 4.597 | 4.721 | 4.855 | 5.000 | 5.159 | 5.332 | 5.523 | 5.735
0.75  | 4.463 | 4.574 | 4.693 | 4.821 | 4.959 | 5.108 | 5.271 | 5.449 | 5.646 | 5.864
0.76  | 4.558 | 4.672 | 4.795 | 4.926 | 5.068 | 5.222 | 5.390 | 5.574 | 5.776 | 6.001
0.77  | 4.659 | 4.776 | 4.902 | 5.038 | 5.184 | 5.343 | 5.516 | 5.705 | 5.914 | 6.147
0.78  | 4.766 | 4.887 | 5.017 | 5.157 | 5.308 | 5.471 | 5.650 | 5.845 | 6.061 | 6.301
0.79  | 4.879 | 5.004 | 5.138 | 5.283 | 5.439 | 5.608 | 5.792 | 5.995 | 6.218 | 6.466


0.80 | 5.000 | 5.129 | 5.268 | 5.417 | 5.579 


5.754 | 5. j 6.643 
0.81 ‘| 5.263 | 5.407 | 5.562 | 5728 | 5910 | etes 6425 6365 | 6832 
0.82  |       |       | 5.556 | 5.716 | 5.889 | 6.077 | 6.283 | 6.508 | 6.758 | 7.035
0.83  |       |       |       | 5.882 | 6.062 | 6.258 | 6.472 | 6.707 | 6.966 | 7.255
0.84  |       |       |       |       | 6.250 | 6.454 | 6.677 | 6.921 | 7.192 | 7.494
0.85  |       |       |       |       |       | 6.667 | 6.899 | 7.155 | 7.438 | 7.754
0.86  |       |       |       |       |       |       | 7.143 | 7.411 | 7.708 | 8.039
0.87  |       |       |       |       |       |       |       | 7.692 | 8.004 | 8.353
0.88  |       |       |       |       |       |       |       |       | 8.333 | 8.701


TABLE B (continued)


α \ β |    0.90 |    0.91 |    0.92 |    0.93 |    0.94 |    0.95 |    0.96 |    0.97 |    0.98 |    0.99

0.70  |   5.493 |   5.733 |   6.008 |   6.327 |   6.706 |   7.167 |   7.750 |   8.528 |   9.672 |  11.728
0.71  |   5.604 |   5.850 |   6.133 |   6.461 |   6.850 |   7.324 |   7.924 |   8.726 |   9.904 |  12.026
0.72  |   5.720 |   5.974 |   6.264 |   6.601 |   7.002 |   7.490 |   8.108 |   8.934 |  10.150 |  12.341
0.73  |   5.843 |   6.103 |   6.402 |   6.750 |   7.162 |   7.665 |   8.302 |   9.155 |  10.411 |  12.676
0.74  |   5.972 |   6.240 |   6.548 |   6.906 |   7.332 |   7.851 |   8.508 |   9.389 |  10.687 |  13.032

0.75  |   6.109 |   6.385 |   6.703 |   7.072 |   7.511 |   8.047 |   8.727 |   9.638 |  10.981 |  13.412
0.76  |   6.253 |   6.539 |   6.866 |   7.248 |   7.702 |   8.256 |   8.959 |   9.902 |  11.295 |  13.818
0.77  |   6.407 |   6.702 |   7.040 |   7.435 |   7.904 |   8.478 |   9.206 |  10.184 |  11.630 |  14.252
0.78  |   6.570 |   6.876 |   7.226 |   7.634 |   8.121 |   8.715 |   9.471 |  10.486 |  11.989 |  14.719
0.79  |   6.745 |   7.061 |   7.424 |   7.847 |   8.352 |   8.969 |   9.754 |  10.811 |  12.376 |  15.223

0.80  |   6.931 |   7.259 |   7.636 |   8.076 |   8.600 |   9.242 |  10.059 |  11.160 |  12.792 |  15.767
0.81  |   7.132 |   7.472 |   7.864 |   8.321 |   8.867 |   9.536 |  10.388 |  11.536 |  13.243 |  16.358
0.82  |   7.347 |   7.702 |   8.109 |   8.586 |   9.155 |   9.853 |  10.743 |  11.945 |  13.733 |  17.002
0.83  |   7.580 |   7.950 |   8.375 |   8.873 |   9.468 |  10.198 |  11.130 |  12.390 |  14.267 |  17.708
0.84  |   7.833 |   8.220 |   8.664 |   9.185 |   9.808 |  10.574 |  11.553 |  12.877 |  14.853 |  18.484

0.85  |   8.109 |   8.514 |   8.980 |   9.527 |  10.181 |  10.986 |  12.016 |  13.412 |  15.499 |  19.343
0.86  |   8.412 |   8.837 |   9.327 |   9.902 |  10.591 |  11.440 |  12.528 |  14.004 |  16.216 |  20.300
0.87  |   8.745 |   9.193 |   9.710 |  10.317 |  11.046 |  11.944 |  13.096 |  14.663 |  17.016 |  21.375
0.88  |   9.116 |   9.590 |  10.137 |  10.780 |  11.553 |  12.507 |  13.733 |  15.403 |  17.918 |  22.590
0.89  |   9.531 |  10.034 |  10.615 |  11.300 |  12.123 |  13.141 |  14.452 |  16.241 |  18.942 |  23.979

0.90  |  10.000 |  10.537 |  11.158 |  11.889 |  12.771 |  13.863 |  15.272 |  17.200 |  20.118 |  25.584
0.91  |         |  11.111 |  11.778 |  12.566 |  13.515 |  14.695 |  16.219 |  18.310 |  21.487 |  27.465
0.92  |         |         |  12.500 |  13.353 |  14.384 |  15.667 |  17.329 |  19.617 |  23.105 |  29.706
0.93  |         |         |         |  14.286 |  15.415 |  16.824 |  18.654 |  21.183 |  25.055 |  32.432
0.94  |         |         |         |         |  16.667 |  18.232 |  20.274 |  23.105 |  27.465 |  35.835
0.95  |         |         |         |         |         |  20.000 |  22.315 |  25.542 |  30.543 |  40.236
0.96  |         |         |         |         |         |         |  25.000 |  28.768 |  34.657 |  46.210
0.97  |         |         |         |         |         |         |         |  33.333 |  40.546 |  54.931
0.98  |         |         |         |         |         |         |         |         |  50.000 |  69.315
0.99  |         |         |         |         |         |         |         |         |         | 100.000


TABLE C 
The function g_ν(x) defined by

$$g_\nu(x) = \frac{\nu x^{\nu}}{1 - x^{\nu}}$$

This function is introduced in Section 9.10 and is used in Section 11.4 to facilitate
computation of a maximum likelihood estimate. (The table was computed by
D. G. Hays and T. R. Wilson.)
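Since several cells of the printed table are hard to read, a short Python sketch may help in recomputing individual entries. It assumes the form g_ν(x) = ν x^ν / (1 − x^ν), which was inferred by matching legible entries (for example .667, .429, .267 at x = .50 for ν = 2, 3, 4, and 98.503 at x = .99 for ν = 2) rather than taken from Section 9.10. The name `g` and its argument order are illustrative choices.

```python
def g(nu: int, x: float) -> float:
    """Assumed form of the Table C function: nu * x**nu / (1 - x**nu)."""
    if nu < 1:
        raise ValueError("nu must be a positive integer")
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    power = x ** nu
    return nu * power / (1.0 - power)


if __name__ == "__main__":
    # One row of the table: x = .50, nu = 1, ..., 10.
    print([round(g(nu, 0.50), 3) for nu in range(1, 11)])
```

Each table row corresponds to one value of x and each column to one value of ν, so a full row can be regenerated with a single comprehension as in the usage lines above.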


ν
1 2 3 4 5 6 7 8 9 10
667 429 267 161 095 055 031 
* Yon 703 459 | 290 | .179 -107 063 037 
52 1.083 741 ET 316 .198 121 1073 1043 
EH 1.128 781 525 343 EI 136 083 050 
754 1.174 823 561 372 | 24 153 095 058 
2 -867 599 403 265 171 108 068 
36 i 2 914 639 | 436 291 191 123 078 
57 1.326 963 682 | .472 320 213 1140 090 
E 1.381 | 1.014 727 510 351 237 158 104 
359 1439 | 1.068 mS .552 385 264 179 119 
1.125 827 596 | .422 294 202 437 
1.185 S81 643 „461 326 227 156 
1.249 939 ,694 504 361 256 179 
1.316 | 1:000 ‘748 551 400 287 204 
1.388 1.066 -806 601 443 322 332 
1.463 | 1.136 |  .869 656 489 
1.544 1211 937 716 41 
1.629 1.290 1.009 780 597 
1.720 1.376 1.088 851 658 
1.818 1.468 1.172 927 726 
1.922 | 1.566 1.00 | .800 
2.033 1.672 1.101 882 
2.153 | 1.787 1.200 | 971 
2282 | 1.910 1308 | 1.070 
2.421 | 2.044 1.426 | 1.179 
2.571 | 2.189 1.556 
2.735 2.347 1,698 
2.913 2.520 1.856 
3.107 2.709 2.030 
3.321 | 2.918 2.222 
3.556 314K 2.437 
3.816 3.403 2.677 
4.105 | 3.687 
4.429 | 4.006 


5.207 4.775 2.996 
5.680 5243 3.416 
6.227 | 5.785 3.909 


6.865 | 6.418 | 5.992 


8.526 

9.635 

11,021 5 
12.804 10.997 10.165 
15.182 13.347 12.490 
18.513 16.649 | 16.205 15.769 
23.510 21.619 | 21.163 | 20.714 
31.841 29.922 | 29.455 | 28.993 
48,505 46.559 | 46.081 45.606 
98.503 


96.529 | 96.040 | 95.553 


TABLE D 


The functions F(α, β, Ω) and G(α, β, Ω) defined by the equations

$$F(\alpha, \beta, \Omega) = \sum_{\nu=0}^{\Omega} \frac{\alpha \beta^{\nu}}{1 - \alpha \beta^{\nu}}$$

$$G(\alpha, \beta, \Omega) = \sum_{\nu=0}^{\Omega} \frac{\nu \alpha \beta^{\nu}}{1 - \alpha \beta^{\nu}}$$

These functions are described in Sections 9.10 and 9.11, and are used for obtaining
maximum likelihood estimates in Sections 10.5 and 10.6. The first entry in each
cell is F(α, β, Ω) and the second entry is G(α, β, Ω). (The table was prepared by
D. G. Hays.)
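The two entries of any cell can be recomputed with the Python sketch below. It rests on an assumption: that F and G are the finite sums of αβ^ν/(1 − αβ^ν) and ν·αβ^ν/(1 − αβ^ν) over ν = 0, 1, ..., Ω, with α read from the row margin and β from the column heading. This form was inferred by matching printed cells (for instance 2.74 and 3.67 at α = .50, β = .80, Ω = 4, and 50.49 and 2.29 at α = .98, β = .50, Ω = 4); it should be checked against Sections 9.10 and 9.11 before being relied on. The helper `_terms` is a hypothetical name introduced here.

```python
from typing import Iterator, Tuple


def _terms(alpha: float, beta: float, omega: int) -> Iterator[Tuple[int, float]]:
    """Yield (nu, alpha*beta**nu / (1 - alpha*beta**nu)) for nu = 0..omega."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    if omega < 0:
        raise ValueError("omega must be a non-negative integer")
    for nu in range(omega + 1):
        p = alpha * beta ** nu
        yield nu, p / (1.0 - p)


def F(alpha: float, beta: float, omega: int) -> float:
    """Assumed first cell entry: unweighted sum of the terms."""
    return sum(term for _, term in _terms(alpha, beta, omega))


def G(alpha: float, beta: float, omega: int) -> float:
    """Assumed second cell entry: sum of the terms weighted by nu."""
    return sum(nu * term for nu, term in _terms(alpha, beta, omega))


if __name__ == "__main__":
    # Cell alpha = .50, beta = .80 in the block headed Omega = 4.
    print(f"{F(0.50, 0.80, 4):.2f}", f"{G(0.50, 0.80, 4):.2f}")  # table: 2.74, 3.67
```

Sharing one generator between `F` and `G` keeps the two sums term-for-term consistent, which matters when both are used in the same likelihood equation.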


Ω = 4

α \ β | .50 | .55 | .60 | .65 | .70 | .72 | .74 | .76 | .78 | .80
50 | 1.57] 170| 1.84] 2.01] 2.21] 2.30] 2.39] 2.50} 2.601 | 2.74 
95] 120) 1.51] 1.89] 2.35] 2:57 | 2.81] 3.07] 3.35} 3.67 
55 | 1.87] 201| 2.17] 2.37] 2.60] 2.71] 2.82] 2.95] 3.09] 323 
1.06| 1.35} 1.70] 2.13] 2.67| 2.92] 3.20] 3.50] 3.84] 4.21 
60 | 2.23 | 2.38 | 2.57| 2.80] 3.07] 3.19] 3.33] 3.48] 3.64] 3.81 
1.18| 1.50] 1.90] 2.39] 3.01 | 3.30] 3.62] 3.97| 4.37] 4.80 
65 | 2.66| 2.84] 3.06] 3.32] 3.63| 3.78] 3.94] 4.11] 4.30] 4.51 
1.30] 1.66] 2.11] 2.67] 3.37] 3.70] 4.07] 4.48] 4.94] 5.45 
30 | 3.23 | 343| 3.67] 3.97] 4.33] 4.51] 4.69] 4.90| 5.13] 5.38 
1.43 | 1.83] 2.33] 2.96] 3.76] 4.14] 4.57] 5.05] 5.58] 6.18 
2| 3.50| 3.71} 3.97| 4.28] 4.67] 4.85] 5.05} 5.27, 5.51] 5.79 
1.49 | 1.90 | 2.43] 3.08] 3.93] 4.33] 4.78] 5.28] 5.85] 6.49 
74 | 3.81 | 4.03] 4.30] 4.63] 5.05] 5.24] 5.45] 5.69] 5.95] 6.24 
1.54] 1.97] 2.52] 321| 4.10] 4.52] 5.00] 5.53| 6.14 | 6.82 
16 | 4.17| 4.40] 4.69] 5.04] 5.47] 5.68] 5.91] 6.16] 6.44 | 6.75 
1.60 | 2.05 | 2.62] 3.34] 4.27] 4.72] 5.23] 5.79 | 6.43] 7.16 
.78 | 4.59] 4.83} 5.13] 5.50] 5.96| 6.18] 6.42| 6.69| 7.00] 7.34 
1.65| 2.12 | 2.72] 3.48 | 4.46] 4.93] 5.46| 6.07| 6.75 | 7.53 
.80 | 5.08 | 5.34| 5.65| 6.04] 6.53| 6.77| 7.03| 731| 7.64 | 8.01 
1.71 | 2.20] 2.82| 3.61] 4.65| 5.15] 5.71] 635| 7.08| 7.91 
82 | 5.68| 5.95| 628| 6.69] 721| 7.46 | 7.74| 8.05| 8.40| 8.80 
1.77 | 2.28 | 2.93| 3.76| 4.85] 5.337| 597| 6.665| 7.42, 8.31 
.84 | 6.41 | 6.69| 7.04| 748| 803| 8.30] 8.60| 893  9.31| 9.74 
1.83 | 2.36| 3.04] 3.91| 5.05| 5.61| 624] 696| 7.78| 8.74 
.86 | 7.35| 7.65| 8.001 | 8.47] 9.06] 9.35 | 9.67 | 10.03 | 10.43 | 10.90 
1.89 | 244| 3.15] 4.06| 5.27] 5.85| 6.53] 729| 817| 9.19 
.88 | 8.58 | 8.89 | 9.28 | 9.77 | 10.39 | 10.70 | 11.04 | 11.43 | 11.87 | 12.38 
1.95| 2.53| 3.27] 422| 549| 6.11| 6.82| 7.64| 8.58| 9.67 
.90 | 10.29 | 10.62 | 11.03 | 11.54 | 12.21 | 12.54 | 12.91 | 13.33 | 13.81 | 14-37 
2.02 | 2.61] 3.39| 4.39| 5.72| 6.38 | 7.14 | 801 | 9.01 | 10.19 
.92 | 12.84 | 13.18 | 13.61 | 14.16 | 14.87 | 15.23 | 15.62 | 16.08 | 16.60 | 17.22 
2.08 | 2.71 | 3.51| 4.56| 5.97| 6.67 | 7.47| 8.39 | 9.47 | 10.74 
94 | 17.06 | 17.41 | 17.87 | 18.45 | 19.21 | 19.59 | 20.02 | 20.52 | 21.09 | 21.76 
2.5| 2.80] 3.64] 4.74] 623| 697| 7.82| 8.81 | 9.97 | 11.34 
.96 | 25.44 | 25.81 | 26.29 | 26.91 | 27.73 | 28.13 | 28.60 | 29.14 | 29.77 | 30.52 
2.22 | 2.89] 3.77] 4.93] 6.50] 7.29] 8.20 | 9.26 | 10.50 | 11.99 
.98 | 50.49 | 50.88 | 51.39 | 52.04 | 52.92 | 53.37 | 53.87 | 54.46 | 55.16 | 55-99 
2.29| 2.99] 3.91] 5.12] 6.78] 7.62] 8.60| 9.73 | 11.08 | 12.70 


TABLE D (continued)

Ω = 4

α \ β | .82 | .84 | .86 | .88 | .90 | .92 | .94 | .96 | .98
50 | 2.87] 302| 318| 3.36 | 3.56 | 3.78 | 403 | 4.31 4.63 
402| 440| 4.83] 5.31 | 5.85 | 646 | 7.16 | 7.97 | 8.90 
S5 | 3.40] 3.58| 3.77| 3.99 | 424 | 4.52] 4.83 | 5.19] 5.61 
463| 509| 560| 619 | 6.85 | 7.61 | 8.49 | 9.53 | 10.75 
60 | 401 | 423| 447| 4.74 | 5.05 | 5.39 | 5.80 | 6.27 682 
529] 584] 647| 7.18 | 7.99 | 8.94 | 10.06 | 11.38 | 13.00 
65 | 475| 501| 531| 5.64 | 602] 646] 698 | 7.60] 835 
603| 669! 7.44] 8.30 | 9.31 | 10.49 | 11.91 | 13.64 | 15.79 
30 | 5.66| so8| 634| 6.76 | 7.24] 7.80 | 8.47 | 9.29 | 10.32 
686| 765| 855] 9.60 | 10.85 | 12.34 | 14.15 | 16.43 | 19.37 
32 | 609| 644| 683| 7.28 | 7.81 | 8.43 | 9.19 | 10.11 11.29 
722| &06| 9.04| 10.18 | 11.53 | 13.17 | 15.19 | 17.74 21.09 
234 | 651| 695| 737| 787| 845| 9.14 | 9.98 | 11.03 12.39 
760| 8.51} 9.56| 10.79 | 12.28 | 14.08 | 16.32 19.20 | 23.04 
76 | 711 {| 752| 799| 8.53 | 9.17 | 9.94 | 10.88 12.08 | 13.64 
800| 897 10.11 | 11.45 | 13.07 | 15.06 | 17.56 20.81 | 2524 
78 | 772| 816| 8.67) 9.27 | 9.98 | 10.84 | 11.91 13.27 | 15.10 
842 | 947 10.70 | 12.16 | 13.93 | 16.13 | 18.93 | 22.63 21.16 
80 | 8.43} 891| 947| 10.13 | 10.92 | 11.88 | 13.09 | 14.66 16.81 
8.871 9,99 | 11.32 | 12.92 | 14.87 | 17.31 | 20.45 | 24.67 30.67 
82 | 925| 978 1039 11.12 | 12.00 | 13.09 | 14.46 | 16.29 18.84 
9.34 | 10.56 | 12.00 | 13.74 | 15.89 | 18.60 | 22.15 | 26.99 34.07 
.84 | 1024 | 10.81 | 11.49 | 12.30 | 13.29 | 14.51 | 16.10 | 18.23 21.30 
9.85 | 11.16 | 12.73 | 14.63 | 17.01 | 20.04 | 24.06 29.66 | 38.09 
.86 | 11.44 | 12:08 | 12.83 | 13.73 | 14.84 | 16.24 | 18.07 20.59 | 24.33 
1038 | 11.80 | 13.51 | 15.61 | 18.24 | 21.64 | 26.22 32.76 | 42.94 
88 | 12.98 | 13.67 | 14.51 | 15.52 | 16.78 | 18.39 20.53 | 23.54 | 28.19 
10.96 | 12:50 | 14.37 | 16.68 | 19.61 | 23.44 | 28.71 36.41 | 48.89 
90 | 15.02 | 15.79 | 16.73 | 17.87 | 19.30 | 21.17 | 23.70 27.37 | 3327 
11.58 |1325 | 15:30 | 17.86 | 21.13 | 25.50 | 31.60 40.78 | 56.39 
92 | 17.93 | 18.79 | 19,84 | 21.14 | 22.80 | 24.99 | 28.03 32.59 | 40.31 
1225 | 14.08 | 16.33 | 19.17 | 22.86 | 27.86 | 35.00 46.13 | 66.13 
94 | 22.56 | 23.53 | 24.71 | 26.20 | 28.14 | 30.75 | 34.48 40.29 | 50.84 
1298 | 1499 | 17.48 | 20.65 | 24.85 | 30.62 | 39.11 52.87 | 79.37 
56 | 31.41 | 32.50 | 33.86 | 35.59 | 37.88 | 41.06 45.76 | 53.49 | 68.81 
13.78 | 15.99 | 18.76 | 22.34 | 27.15 | 3392 44.20 | 61.68 | 98.56 
-98 |57.00 | 58.24 | 59.81 | 61.85 | 64.63 68.60 | 74.78 | 85.71 | 110.55 
14.67 | 17.11 | 20.22 | 24.30 | 29.88 37.98 | 50.77 | 73.97 | 129.54 


TABLE D (continued)

Ω = 8

α \ β | .50 | .55 | .60 | .65 | .70 | .72 | .74 | .76 | .78 | .80
50 | 1.61) 1.75) 1.93] 215] 2.43] 2.57] 2.72] 289| 308 3.30 
1.12 | 1.50} 2.02] 2.74] 3.73} 4.23] 480| 546| 6221 711 
-55 | 1.90} 2.07} 2.27] 2.53] 285] 3.01] 319| 3381 360! 385 
1.25 | 1.68 | 2.26) 3.07] 4.19] 4.76] 5.41 6.16 | 7.03 | 8.05 
-60 | 2.26 | 2.45 | 2.68 | 2.97| 3.34] 3552| 373 3.95 | 4.21 | 4.50 
1.39 | 1.86 | 2.52| 3.42] 4.68| 5.31 | 605| 690 7.89 | 9.05 
65 | 2.70] 291 1 3.17] 3.50] 3.93] 414| 437 4.63 | 4.93 | 5.27 
1.53 | 2.06] 2.78} 3.78} 5.19] 5.90] 673| 7.9 8.81 | 10.12 
70 | 3.27] 3.50} 3.80] 4.17] 4.66] 4.90] 5.17 5.47 | 5.81 | 6.21 
1.67 | 2.26 | 3.06| 4.17] 5.73] 6.53| 7.46 8.54 | 9.80 | 11.29 
72 | 3.54) 3.79] 4.10] 449| 501| 526 5.54] 5.86] 6.22] 6.64 
1.73 | 2.34] 3.17} 4.33] 5.96] 6.79] 7.76 8.89 | 10.21 | 11.78 
74 | 3.86] 4.11] 4.44] 4.85] 530] 566 5.95 | 6.30] 6.68] 7.13 
1.79) 2.42] 3.29] 4.49] 6.191 7.06 8.07 | 9.25 | 10.64 | 12.29 
76 | 4.22) 4.48] 4.82] 5.26] 583 6.11 | 6.43 | 6.79] 720| 7.67 
1.86) 2.51 | 3.41] 4.66] 643| 734 8.40 | 9.63 | 11.09 | 12.81 
78 | 463| 491| 527| 5.73] 633 6.63 | 6.96 | 7.34] 7.78 | 828 
1.92 | 2.60| 3.53| 483| 668| 7.62 8.73 | 10.02 | 11.55 | 13.36 
80 | 5.13] 542! 579| 628 | 691 7.22 | 7.58| 7.98 | 845 | 8.99 
1.98 | 2.69] 3.66] 5.01] 6.93] 792 9.07 | 10.43 | 12.03 | 13.93 
82 | 5.73] 603| 642| 693| 7.60] 793 8.31 | 8.74] 923| 9.81 
2.05 | 2.78} 3.79] 5.19} 719| 825 9.43 | 10.84 | 12.52 | 14.52 
84 | 646| 678 | 7.19| 7.73] 843 8.78 | 9.18 | 9.64 | 10.17 | 10.78 
2.12 | 287! 3.91 | 538| 746| 854 9.80 | 11.28 | 13.04 | 15.14 
86 | 740| 7.73] 816| 8.72] 947 9.84 | 10.27 | 10.75 | 11.32 | 11.98 
2.18 | 2.97] 4.05] 5.57 7.74 | 8.86 | 10.18 | 11.73 | 13.58 | 15.79 
88 | 8.64) 8.99 | 9.44 | 10.03 | 10.82 11.21 | 11.66 | 12.18 | 12.78 | 13.49 
2.26 | 3.07| 419] 5.76] 803 9.20 | 10.58 | 12.20 | 14.14 | 16.47 
:90 110.35 | 10.72 | 11.19 | 11.81 | 12.65 13.06 | 13.54 | 14.10 | 14.75 | 15.51 
235| 36! 4.33] 5.97| 82 | 9.55 | ido 12.70 | 14.73 | 17.18 
92 | 12.90 | 13.28 | 13.77 | 14.43 | 15:32 15.76 | 16.27 | 16.87 | 17.56 | 18.39 
2.40 | 327| 4.47] 6.18] 8.63] 992 11.42 | 13.21 | 15.35 | 17.94 
-94 | 17.11 | 17.51 | 18.03 | 18.72 | 19.67 20.14 | 20.69 | 21,33 | 22.08 | 22.97 
2.47 | 3.37! 4.62} 640| 8.96 | 10.30 11.88 | 13.76 | 16.01 | 18.74 
-96 | 25.50 | 25.92 | 26.46 | 27.19 | 28.19 28.70 | 29.29 | 29.98 | 30.79 | 31.76 
2.55 | 3.48 | 478 | 6.62| 929 | 10.69 12.35 | 14.33 | 16.71 | 19.60 
:98 |50.55 | 50.99 | 51.56 | 52.33 | 53.40 53.94 | 54.57 | 55.32 | 56.20 | 57.27 
2.63 | 3.59] 494| 6.85] 9.65] 1111 12.86 | 14.94 | 17.45 | 20.52 


TABLE D (continued)

Ω = 8

α \ β | .82 | .84 | .86 | .88 | .90 | .92 | .94 | .96 | .98
50 | 3.54] 3.81] 4.13] 4.50] 4.93! 544| 605! 681 7.16 
8.14 | 9.35 | 10.77 | 12.47 | 14.51 | 17.00 | 20.09] 23.99] 29.08 
55 | 4.14] 4.47] 4.84] 5.28 | sso] 642] 718| 813 9.35 
9.24 | 10.64 | 12.30 | 14.30 | 16.72 | 19.70 | 23.46 | 28:32 | 3482 
60 | 4.84] 522| 5.67] 619 | 681] 7.57] &50| 9.59 11:27 
10.41 | 12.03 13.95 | 16.29 | 19.15 | 22.73 | 27.30 | 33.34] 4168 
65 | 5.66] 6.11] 664| 726 | 802| 8.93 | 10.09] 11.60 13.66 
11.68 | 13.53 | 15.76 | 18.48 | 21.86 | 26.13 | 31.70] 39.24] 5002 
‘70 | 6.66} 7.19] 7.82] 856 | 9.47 | 10.60 | 12.04 13.97 | 16.70 
13.05 | 15.18 | 17.75 | 20.92 | 24.90 | 30.01 | 36.81] 4628| 6040 
32 | 7.13} 7.69] 8.36] 9.16 | 10.15 | 11.37 | 1295] 1508 18.16 
13.64 | 15.87 | 18.60 | 21.97 | 26.22 | 31.72 | 39.10 | 49,50 | 65.30 
A4 | 7.64) 8.25] 8.97] 9.83 | 10.89 | 1222 | 13:96 16.33 | 19.81 
14.24 | 16.61 | 19.49 | 23.08 | 27.62 | 33.54 | 41.55 | 52.99 | 70.73 
76 | 822| 8.87] 9.64| 10.57 | 11.72 | 13.17 | 1508 | 1772| 2167 
14.87 | 17.37 | 20.42 | 24.24 | 29.10 | 35.47 | 44.19 | 5680| 76.79 
:78 | 8.87 | 9.56 | 10.39 | 11.40 | 12.65 | 1424 | 16.34] 1929 | 23.80 
15.53 | 18.16 | 21.40 | 25.46 | 30.66 | 37.54 | 47.03 | 60.98 | 83.59 
— 
:80 | 9.62 | 10.36 | 1125 | 12.34 | 13.70 | 1544 | 17.77 21.08 | 26.26 
16.21 | 19.00 | 22.43 | 26.75 | 32.33 | 39.75 | 50.11 | 65.57 | 91.28 
82 | 10.48 | 11.28 | 1224 | 13.42 | 14.90 | 16.82 19.41 | 23.15 | 29.14 
16.93 | 19.87 | 23.51 | 28.12 | 34.10 | 42.13 | 53.47 | 70.66 | 100.07 
:84 (1150 | 12.37] 13.41 | 14.69 | 1631 | 1843 | 2133 25.57 | 32.56 
17.68 | 20.79 | 24.66 | 29.57 | 36.00 | 44.70 | 57.13 | 76.33 | 110.20 
386 1125 | 13.68 | 14.81 | 1622 | 18.00 | 2035 | 23.60] 2846 36.70 
18.46 | 21.76 | 25.87 | 31.13 | 38.04 | 47.50 | 61.17 | 82.72 | 122.03 
88 1433 | 15.33 | 1657 | 18.10 | 20.08 | 22.70 | 2638 | 31.99 | 41.83 
19.30 | 22.78 | 27.16 | 32.79 | 40.25 | 50.55 | 65.66 | 89.97 | 136.04 
rc - 
90 | 16.41 | 17.51 | 18.86 | 20.55 | 22.74 | 25.69 | 29.90| 3645 48.41 
20.17 | 23.88 | 28.55 | 34.59 | 42.66 | 53.92 | 70.69 | 98.30 | 152.94 
9? | 19.37 | 20.57 | 22.05 | 23.93 | 26.39 | 29.74 | 3460| 4237| 3723 
21.11 | 25.04 | 30.04 | 36.54 | 45.30 | 57.67 | 76.39 | 108.05 | 173.83 
94 | 24.04 | 25.36 | 27.00 | 29.10 | 31.88 | 35.74 | 4145] 50:87 69.90 
22.10 | 26.30 | 31.66 | 38.68 | 48.23 | 61.89 | 82.97 | 119.68 | 200.49 
$6 |32.94 | 34.39 | 3622 | 38.59 | 41.79 | 4630 | 53.17] 6495| 90.52 
23.17 | 27.66 | 33.42 | 41.04 | 51.52 | 66.74 | 90.72 | 133.98 | 236.15 
98 | 58.57 | 60.18 | 62.25 | 64.97 | 68.70 | 74.11 | 82.66 | 98.18 | 135.61 
24.33 | 29.14 | 35.38 | 43.70 | 55.28 | 72.43 | 100.17 | 152.47 | 287.86 


TABLE D (continued)

Ω = 12

α \ β | .50 | .55 | .60 | .65 | .70 | .72 | .74 | .76 | .78 | .80
.50 1.61 1.75 | 1.94 | 2.17] 249| 2.64| 2.81 | 3.01 | 324| 3.51 
1.13 | 1.55 | 2.13 | 2.98] 425| 493| 5.73 | 6.69| 7.85 | 924 
5 1.90 | 2.07 | 2.28 | 2.55 | 2.91 | 3.09] 3.29] 3.52] 3.78 | 4.09 
1.27 | 1.73 | 2.38 | 3.34| 4.76] 5.53] 6.43] 7.52] 8.83 | 10.41 
.60 | 2.26 | 2.45] 2.69| 3.00] 3.41] 3.61] 3.84] 4.10] 4.41] 4.76 
1.41 1.92 | 2.65 | 3.71 | 5.30] 6.15 | 7.17] 8.39 | 9.86 | 11.64 
.65 2.70 | 2.92 | 3.19 | 3.54| 4.00| 4.23| 4.49] 4.79] 5.14] 5.55 
1.55} 2415| 2.92| 4.10] 5.87] 6.81| 7.95| 9.31 | 10.95 | 12.95 
.70 | 327| 3.51 | 3.81| 4.21] 4.73| 5.00| 5.30| 5.64| 6.04| 6.51 
1.70| 2.32| 3.21 | 4.51 | 6.47] 7.51 | 8.77 | 10.28 | 12.11 | 14.34 
72 | 3.55| 3.79 | 411 | 4.52] 5.08| 5.36] 5.67| 603| 6.46| 6.95 
1.76 | 2.41 | 333 | 4.68} 6.71] 7.81 | 9.11 | 10.69 | 12.60 | 14.93 
44 3.86 | 4.12 | 445| 4.89] 5.47| 5.76| 6.00| 6.48] 6.92] 7.45 
1.82 | 2.49 | 3.45| 4.86| 6.97| 8.11 | 9.47 | 11.11 | 13.10 | 15.53 
-76 | 4.22| 4.49 | 4.84| 5.30| 5.9931| 621| 6.57| 6.997| 7.44 | 8.00 
1.88| 2.58 | 3.57| 5.03| 7.23] 8.41 | 9.83 | 11.53 | 13.61 | 16.15 
78 4.63 | 492| 529| 5.77| 6.41] 673| 710| 7.54| 8.03] 8.62 
1.95 | 2.67} 3.70 | 5.21 | 7.50 | 8.73 | 10.20 | 11.98 | 14.14 | 16.80 
.80 5.13 | 5.43 | 5.81 | 6.32] 6.99| 7.33| 7.73| 8.18| 871| 9.33 
2.0 | 2.76 | 3.83 | 5.40| 7.77| 9.05 | 10.58 | 12.43 | 14.69 | 17.46 
82 | 5.73 | 6.04| 6.44| 6.97| 7.69| 8.05| 8.46| 8.94| 9.50 | 10.16 
2.08 | 2.85| 3.96| 5.59 | 8.05| 9.38 | 10.98 | 12.91 | 15.26 | 18.15 
.84 | 647| 679| 721| 7.77| 8.52] 8.90] 9.34] 9.85 | 10.44 | 11.15 
2.15 | 295| 4.10] 5.79| 8.34 | 9.73 | 11.39 | 13.39 | 15.85 | 18.86 
.86 | 7.40| 7.74| 8.18| 8.77| 9.56 | 9.96 | 10.43 | 10.97 | 11.60 | 12.35 
2.21 | 3.05 | 424| 5.99| 8.65 |10.08 | 11.81 | 13.90 | 16.46 | 19.61 
88 | 8.64| 8.99 | 9.45 | 10.07 | 10.91 | 11.33 | 11.82 | 12.40 | 13.07 | 13.87 
2.29 | 3.15 | 4.38 | 6.20] 8.93 | 10.45 | 12.24 | 14.43 | 17.09 | 20.39 
-90 | 10.35 | 10.72 | 11.21 | 11.85 | 12.74 | 13.19 | 13.71 14.32 | 15.04 | 15.90 
2.36 | 3.25| 4.53 | 6.41 | 9.27 | 10.83 | 12.70 | 14.97 | 17.76 | 21.20 
92 | 12.90 | 13.29 | 13.79 | 14.47 | 15.41 | 15.89 | 16.45 | 17.10 | 17.87 | 18.79 
2.43 | 3.35| 4.67| 6.63| 9.61 | 11.22 | 13.17 | 15.54 | 18.45 | 22.06 
94 | 17.11 | 17.52 | 18.05 | 18.77 | 19.76 | 20.27 | 20.87 21.56 | 22.39 | 23.38 
2.51 | 3.46| 4.83| 6.86| 9.95 | 11.63 | 13.67 16.14 | 19.18 22.96 
.96 | 25.50 | 25.92 | 26.48 | 27.24 | 28.29 | 28.83 | 29.47 | 30.21 | 31.11 | 32.18 
2.58 | 3.57 | 4.98 | 7.09 | 10.31 | 12.06 | 14.18 | 16.77 | 19.95 | 23.92 
-98 | 50.55 | 51.00 | 51.58 | 52.38 | 53.50 | 54.08 | 54.76 | 55.56 | 56.53 | 57.70 
2.66 | 3.68 | 5.15 | 7.33 | 10.68 | 12.51 | 14.72 | 17.43 | 20.77 | 24.94 


TABLE D (continued)

Ω = 12

α \ β | .82 | .84 | .86 | .88 | .90 | .92 | .94 | .96 | .98
.50 3.81 | 4.17 | 4.60 5.11 5.73 6.50 7.47 8.75 | 10.48 
10.94 | 13.01 | 15.57 | 18.77 | 22.81 28.01 | 34.86 | 44.19 | 57.52 
335 4.44 | 4.86| 5.36 5.97 6.70 7.62 8.80 | 10.37 | 12.55 
12.34 | 14.71 | 17.64 | 21.33 | 26.03 32.14 | 40.31 | 51.67 | 68.37 
.60 5.17 | 5.66 | 6.24 6.95 7.82 8.91 | 10.34 | 12.27] 15.05 
13.82 | 16.51 | 19.85 | 24.08 | 29.52 | 36.68 | 46.39 | 60.17 | 81.15 
.65 6.02 | 6.59 | 7.27 8.09 9.13 10.43 | 12.16 | 14.55 | 18.10 
15.40 | 18.43 | 22.23 | 27.06 | 33.33 | 41.68 | 53.20] 69.96 | 96.43 
-70 7.05 | 7.71 | 8.50 9.47 | 10.69 12.26 | 14.36 | 17.33 | 21.93 
17.09 | 20.50 | 24.80 | 30.31 | 37.53 | 47.26 | 60.93 | 81.35 | 115.03 
22 7.53 | 8.23 | 9.07 | 10.11 11.41 13,10.| 15:38 | 18.63] 23775. 
17.80 | 21.37 | 25.89 | 31.70 | 39.33 | 49.67 | 64.32 | 86.46 | 123.67 
74 8.06 | 8.80 | 9.70} 10.81 12.21 14.03 | 16.50 | 20.07 | 25.79 
18.53 | 22.28 | 27.02 | 33.14 | 41.21 | 52.21 | 67.92 | 91.94 | 133.13 
16 8.65 | 9.44 | 10.39 | 11.58 | 13.08 15.05 | 17.74 | 21.67 | 28.07 
19.29 | 23.22 | 28.20 | 34.64 | 43.18 | 54.89 | 71.74 | 97.84 | 143.57 
-78 9.32 | 10.15 | 11.17 | 12.44 | 14.06 16.19 | 19.12 | 23.45 | 30.65 
20.08 | 24.19 | 29.42 | 36.21 | 45.25 57.72 | 75.81 | 104.21 | 155.13 
-80 | 10.07 | 10.97 | 12.06 | 13.42 | 15.16 17.47 | 20.67 | 25.46 | 33.62 
20.89 | 25.20 | 30.70 | 37.85 | 47.43 60.71 | 80.17 | 111.13 | 168.03 
-82 | 10.95 | 11.90 | 13.07 | 14.54 | 16.42 18.93 | 2244 | 27.77 | 37.05 
21.74 | 26.25 | 32.03 | 39.58 | 49.72 | 63.89 | 84.84 | 118.67 | 182.51 
-84 | 11.99 | 13.01 | 14.26 | 15.84 | 17.88 20.61 | 24.49 | 30.44 | 41.07 
22.62 | 27.35 | 33.43 | 41.39 | 52.15 67.28 | 89.88 | 126.94 | 198.92 
86 |13.25 | 14.34 | 15.69 | 17.40 | 19.62 | 22.61 | 26.90 | 33.60 | 45.88 
23.54 | 28.51 | 34.90 | 43.31 | 54.74 | 70.92 | 95.34 | 136.07 217.70 
88 | 14.84] 16.01 | 17.47 | 19.32 | 21.75 | 25.05 | 29.83 37.41 | 51.76 
24.50 | 29.72 | 36.45 | 45.35 | 57.50 | 74.85 | 101.31 146.24 | 239.45 
90 |16.94 | 18.20 | 19.79 | 21.81 | 24.47 | 28.13 33.49 | 42.17 | 59.18 
25.52 | 30.99 | 38.10 | 47.52 | 60.47 | 79.10 107.88 | 157.69 | 265.01 
92 119.91 | 21.28 | 23.00 | 25.22 | 28.17 | 32.27 38.35 | 48.42 | 68.94 
26.58 | 32.35 | 39.85 | 49.85 | 63.68 83.77 | 115.18 | 170.75 | 295.67 
-94 |24.59 | 26.09 | 27.98 | 30.43 | 33.72 | 38.35 45.36 | 57.26 | 82.69 
27.72 | 33.79 | 41.73 | 52.38 | 67.20 | 88.93 123.43 | 185.92 | 333.43 
-96 | 33.50 | 35.14 | 37.23 | 39.96 | 43.69 49.001 | 57.25 | 71.71 | 104.54 
28.92 | 35.34 | 43.77 | 55.13 | 71.09 | 94.74 | 132.93 | 204.02 | 381.82 
98 | 59.14 | 60.95 | 63.28 | 66.38 | 70.66 | 76.92 86.92 | 105.34 | 151.06 
3022 | 37.02 | 45.99 | 58.19 | 75.46 | 101.43 | 144.20 226.58 | 448.29 


TABLE D (continued)

Ω = 16

α \ β | .50 | .55 | .60 | .65 | .70 | .72 | .74 | .76 | .78 | .80
.50 | 1.61) 1.75 | 1.94] 218] 2.50| 2.66] 2.84] 3.05] 3.30| 3.59 
1.14) 1.55] 2.15] 3.04] 4.42] 5.19] 6.11] 7.25] 8.66| 10.42 
535 | 1,90] 2.07] 2.28) 2.56] 2.93] 3.11] 3.31] 3.56] 3.84] 4.18 
1.27) 1.73 | 2.40] 3.41 | 4.95] 5.81] 6.85} 8.14] 9.73] 11.71 
60 | 2.26 | 2.45] 2.69] 3.00) 3.42] 3.63] 3.87] 415| 447 | 4.86 
1.41 | 1.93] 2.67] 3.78] 5.51 | 646 | 7.63} 9.06 | 10.84 | 13.06 
65 | 2.70] 2.92] 3.19] 3.54] 4.02] 4.25] 4.53] 4.85] 521| 5.65 
155) 2.16] 2.95] 4.18] 6.09] 7.15] 8.45} 10.04 | 1201 | 14.49 
10 | 3.27) 3.51) 3.81] 4.21] 4.75] 5.02] 5.33] 5.70! 6.12] 6.62 
1.70 | 2.33) 3.24] 4.60] 6.71 | 7.88] 9.31 | 11.07 | 1326 | 16.01 
72 | 3.55) 3.79) 4.11] 4.53] 5.10| 5.38] 5.71] 6.09 | 654 | 7.07 
1.76 | 241] 3.36 | 4.77| 696 | 8.18 | 9.67 | 11.50 | 13.78 | 16.64 
74 | 3.86) 4.12} 445 | 4.89 | 5.49| 5.79 | 6.13] 654 TOL | 7.57 
1.82 | 2.50| 3.48| 4.95| 7.22] 849 |10.03 | 11:94 | 1431 | 1729 
76 | 4.22) 4.49) 4.84] 5.30] 5.93| 624| 661 7.03 | 7.53) 8.13 
1.88 | 2.59) 3.60) 5.13] 7.49] 8.81 | 10.41 | 1230 14.86 | 17.96 
78 | 4.63) 4.92] 5.29] 5.77] 643 6.76 | 7.15 | 7.60| 8.13] 8.75 
1.95 | 2.68 | 3.73} 5.31] 7.77] 9.13 | 10.80 12.86 | 15.43 | 18.66 
80 | 5.13 | 5.43 | 5.81] 632] 7.01] 7.36 7.77 | 8.4| 8.80] 9.47 
2.01 | 2.77 | 3.86] 5.50] 8.05] 9.46 | 11.20 13.34 | 16.01 | 19.37 
82 | 5.73 | 6.04] 6.44] 6.98 | 7.71 | 8.08 8.51 | 9.01 | 9.59 | 10.30 
2.08 | 2.86 | 4.00 | 5.69| 8.34 | 9.81] 11.61 13.83 | 16.61 | 20.11 
84 | 6.47) 6.79| 7.21 | 7.77| 8.54| 893| 939 9.91 | 10.54 | 11.29 
2.15] 2.96 | 4.13] 5.89 | 8.64 | 10.16 | 12/03 14.34 | 17.23 | 20.87 
86 | 7.40 | 7.74 | 8.19 | 8.77| 9.58 | 10.00 10.47 | 11.04 | 11.70 | 12.49 
2.22 | 3.06 | 4.27 | 6.09 | 8.94 | 10.53 12.47 | 14.87 | 17.88 | 21.67 
88 | 8.64 | 8.99 | 9.46 | 10.08 | 10.93 | 11.37 11.87 | 12.47 | 13.18 | 14.02 
2.29 | 3.16 | 4.41 | 6.30 | 9.26 | 10.90 12.92 | 15.42 | 18.55 | 22.50 
90 | 10.35 | 10.72 | 11.21 | 11.86 | 12.76 | 1323 13.76 | 14.39 | 15.15 | 16.05 
2.36 | 3.26 | 4.56 | 6.52 | 9.59 | 11.29 13.39 | 15.99 | 19.24 | 23.37 
92 | 12.90 | 13.29 | 13.80 | 14.48 | 15.44 | 15.93 16.50 | 17.17 | 17.98 | 18.95 
2.43 | 3.36| 4.71 | 6.74] 9.93 | 11.70 13.88 | 16.58 | 19.97 | 24.27 
94 | 17.11 | 17.52 | 18.06 | 18.78 | 19.79 | 20.31 20.92 | 21.64 | 22.50 | 23.54 
2.51 | 3.47 | 4.87 | 697 | 10.27 | 12.12 | 14.39 17.21 | 20.74 | 25.22 
96 | 25.50 | 25.92 | 26.49 | 27.25 | 28.32 | 28.87 29.52 | 30.29 | 31.22 | 32.34 
2.59 | 3.58 | 5.02 | 7.21 | 10.64 | 12.56 14.92 | 17.85 | 21.54 | 26.23 
98 | 50.55 | 51.00 | 51.59 | 52.39 | 53.53 | 54.12 54.81 | 55.64 | 56.64 | 57.86 
2.66 | 3.69] 5.19} 7.45 | 11.02 | 13.02 15.48 | 18.54 | 22.39 | 27.30 


TABLE D (continued)

Ω = 16

α \ β | .82 | .84 | .86 | .88 | .90 | .92 | .94 | .96 | .98

.50 | 3.93| 4.34] 4.84] 5.45 6.22 7.21 8.50 | 10.28 | 12.86 
12.63 | 15.44 | 19.03 | 23.70 | 29.86 | 38.17 | 49.68 | 66.32] 91.96 

395 | 457 | 303| 3:63 6.35 7.25 8.41 9.96 | 12.12 | 15.34 
14.21 | 17.38 | 21.47 | 26.81 | 33.89 | 43.52] 57.05 | 76.98 | 108.66 

.60 | 5.31 | 5.86] 6.53 7.37 8.42 9.79 | 11.64 | 14.26 | 18.30 
15.86 | 19.44 | 24.05 | 30.11 | 38.20 | 49.31 | 65.13 | 88.93 | 128.07 

.65 6.18] 6.81| 7.59| 8.55 9.79 11.41 | 13.61 | 16.80} 21.87 
17.62 | 21.63 | 26.82 | 33.65 | 42.86 | 55.63 | 74.08 | 102.46 | 150.93 

0 | 7.22) 4.95| 885| 9.97 | IL.41 13.33 | 15.97 | 19.87 | 26.31 
19.49 | 23.96 | 29.77 | 37.47 | 47.92 | 62.57 | 84.06 | 117.94 | 178.31 

72 7.71 | 8.48 | 9.43 | 10.62 | 12.16 14.21 | 17.05 | 21.29 | 28.40 
20.27 | 24.93 | 31.01 | 39.09 | 50.08 | 65.54 | 88.39 | 124.79 | 190.85 

74 8.24 | 9.06 | 10.07 | 11.34 | 12.98 15.17 | 18.24 | 22.86 | 30.73 
21.08 | 25.95 | 32.30 | 40.76 | 52.32 | 68.65| 92.95 | 132.07 | 204.49 

76 | 8.84| 9.70 | 10.77 | 12.13 | 13.88 16.23| 19.55 | 24.58 | 33.32 
21.91 | 26.99 | 33.63 | 42.50 | 54.65 | 71.90 | 97.75 | 139.85 | 219.38 

78 9.51 | 10.42 | 11.56 | 13.01 | 14.88 17.41 | 20.99 | 26.50| 36.24 
22.77 | 28.07 | 35.01 | 44.30 | 57.08 | 75.31 | 102.83 | 148.17 | 235.71 

80 | 10.27] 11.24 | 12.46 | 14.00 | 16.01 18.73 | 22.62 | 28.65 | 39.55 
23.65 | 29.18 | 36.45 | 46.19 | 59.62 | 78.90 | 108.21 | 157.11 | 253.74 

82 |11.15 | 12.19 | 13.48 | 15.13 | 17.22 | 2023| 2446 | 31.11 | 43.37 
24,57 | 30.35 | 37.94 | 48.15 | 62.29 | 82.69 | 113.95 | 166.76 | 273.75 

-84 |12.19 | 13.30 | 14.68 | 16.45 | 18.78 | 21.97] 26.59 | 33.93 | 47.80 
25.53 | 31.55 | 39.49 | 50.21 | 65.10 | 86.69 | 120.07 | 177.23 | 296.13 

86 |13.46 | 14.64 | 16.13 | 18.03 | 20.55 | 24.00 | 29.08 | 37.25 | 53.07 
26.52 | 32.81 | 41.13 | 52.37 | 68.06 | 90.96 | 126.65 | 188.66 321.37 

-88 | 15.05 | 16.32 | 17.92 | 19.97 | 22.71 26.49 | 32.08 | 41.23 | 59.43 
27.56 | 34.13 | 42.84 | 54.65 | 71.21 95.51 | 133.75 | 201.22 | 350.15 

-90 | 17.15 | 18.52 | 20.25 | 22.47 | 25.45 | 29.62] 35.83 | 46.17 | 67.39 
28.64 | 35.51 | 44.65 | 57.07 | 74.57 | 100.42 | 141.49 | 215.17 | 383.39 

92 | 20.13 | 21.60 | 23.47 | 25.91 | 29.18 | 33.80] 40.77 | 52.59 | 77.73 
29.79 | 36.98 | 46.56 | 59.65 | 78.18 | 105.73 | 149.99 | 230.84 | 422.46 

94 |24.82 | 26.42 | 28.46 | 31.13 | 34.76 | 39.93 | 47.87 | 61.63 | 92.13 
30.99 | 38.54 | 48.61 | 62.43 | 82.09 | 111.57 | 159.47 | 248.74 | 469.47 

-96 | 33.73 | 35.48 | 37.72 | 40.68 | 44.75 | 50.63| 59.85 | 76.27 114.69 
32.27 | 40.19 | 50.81 | 65.43 | 86.38 | 118.06 | 170.24 269.71 | 528.11 

98 |59.38 | 61.30 | 63.79 | 67.11 | 71.75 | 78.59 89.61 | 110.11 | 162.01 
33.64 | 41.98 | 53.20 | 68.74 | 91.16 | 125.44 182.81 | 295.28 | 605.96 


Glossary of Symbols Frequently Used 


The numbers in parentheses are numbers of sections where the symbols
are first introduced. English letters are listed first; Greek letters, next;
and special symbols, last.


Ay Aas? t Ay? tse 
4, Ai, Ajk 


n 

Qy, o EREE o PEEN 0, 
Pp Pon 

Pepsi spit spe 


Ps Pus yn 

d. ds 

Qi, On. 

Qi, Qj 

Ry, Rie Res Ra 


uda 
T(x, B) 
T, T, Tj. 
V, 


la 

v " " 
hy yy fen 
P5 Xy Hip 


^ 
Ais Ase 


Alternatives or response classes (1.2). 

Gain parameter in the gain-loss form of the event operators 
(1.6). 

Loss parameter in the gain-loss form of the event operators 
(1.6). 

Projection matrix (1.8). 

Expected value of x (4.8). 

Events which alter the response probabilities (1.3). 

Mean number of trials before the first A₁ or A₂ occurrence,
respectively (9.6).

Cumulative distribution function (4.2). 

Function given in Table D (9.11). 

Function given in Table D (9.10). 

Function given in Table C (9.10). 

Time increment in runway model (14.2). 

Identity (matrix) operator (1.5). 

Trial index (4.8). 

Trial index (9.4). 

Trial index (3.3). 

Outcomes of response occurrences (1.3). 

Probability of p-value sequence (3.9). 

Response probabilities (1.2). 

Probability vector (1.5). 

Probability of alternative A₁ (1.6).

Probability of alternative A₂ (1.6).

Row operator (for p) (1.6). 

Row operator (for q = 1 - p) (1.6).

Length of run of A₁'s or A₂'s (4.8).

Mean total number of A₁'s or A₂'s, respectively (9.6).

Function given in Table B (8.2). 


Event operator (matrix) (1.5). 
mth raw moment of p-value distribution (4.2). 


Vector of marginal means (4.4). 

Random variable representing response occurrence (9.4). 

Operator parameter (measures “ineffectiveness” of event) 
(1.6). 

Increment (3.4). 

Fixed point of row operator (1.6). 




C) 


8 IPIRD CAS 


v 


ray} 


ee 
E 
— 


wD! 




Elements of limit vector (1.8). 

Limit vector for event E; (1.8). 

Limit matrix (1.8). 

Probability of event E, (sometimes an abbreviation for Tk) 
(3.9). 

Conditional probability of outcome Oₖ, given alternative Aⱼ
(3.13).

Product (4.8). 

Standard deviation, variance (4.2). 

Function given in Table A (4.8). 

Function given in Table A (4.8). 

Chi-square statistic (9.13). 

Measure of the set named in the parentheses (2.1). 

Set sum, join, or union (2.2). 

Set product, meet, or intersection (2.2). 

"Is approximately equal to" (6.4).

"Estimates" or "is estimated by" (9.6).

Infinity (3.3). 

Conditional probability of x, given y (3.11).

Absolute value or magnitude of x (6.3).

Binomial coefficient $N!/[\nu!\,(N-\nu)!]$ (4.2).


Mean or average of p, α, π, etc. (3.9).
Estimate of p, α, π, etc. (9.6).


Index 


Absorbing barriers, 154, 166-169, 287 
289 
Alternations, 199 
Alternatives, 14, 44 
Anxiety reduction, 253 
Arkin, H., 152 
Articulation value, 218, 219 
Association theory, 167, 189-191 
Associativity, 24, 25 
Assumptions of general model, combining classes, 38-43, 54, 331
event invariance, 308 
homogeneity, 50-53 
independence of path, 17, 20, 78, 
330 
linearity, 20, 44, 331 
Asymptotic distribution, 99, 131-138, 
155, 156, 164-169, 174, 331, 332 
Asymptotic means, 89, 93, 109, 111, 
119, 120, 123, 124, 139-141, 280, 
282, 288, 289, 301-303 
Asymptotic second moment, 90, 114 
Avoidance training, 6-10, 195-197, 
237 ff. 


Bailyn, L., x, 148, 344 

Bar-pressing, 13, 39, 187 

Baseline use of model, 108, 235, 236,
272, 336

Beggs, J. E., x 

Bekhterev, V. M., 237 

Bellman, R., x, 163, 170 


Best asymptotically normal estimates, 
267 

Bias of estimates, 194, 199, 200, 205, 
225, 231, 246, 319 

Binomial coefficients, 86, 193, 313 

Binomial distribution, 155, 156, 193, 
272 

Binomial expansion, 86 

Binomial observations, 193, 272 


Birkhoff, G., 59, 82 
Birnbaum, A., x, 167, 170 
Blank trials, 281, 282, 301, 306 
Bounds, on means, 128, 141-152, 156- 
162, 165, 174, 175, 180 
on second moments, 145 
on third moments, 148 
Branching diagrams, 70, 79 
Bromwich, T. J. I'a., 170
Bruner, J. S., 217, 219, 220, 230, 235, 
236 
Brunswik, E., 68, 69, 82, 106, 115,
116, 154, 173, 275, 276, 277, 279, 
291, 309 


Burke, C. J., x, 48, 54, 33 

Burros, R. H., 45 

Bush, R. R., xi, 45, 54, 82, 105, 152,
216, 278, 312, 328, 337 




Carver, D., 278 

Chi square, 212-214, 220, 222, 231, 
233, 234, 253, 266-269, 271 

Classical conditioning, 237, 253, 254 

Clayton, F. L., 82 

Closed interval, 33 

Cluster learning, 234 

Cognitive theory, 333 

Colton, R. R., 152 

Column vector, 25 

Combining classes restriction, 38-43, 
54, 331 

Commutativity, 19, 23, 24, 63-66, 171-173,
240, 253

Commutator, 19 

Complement of set, 49

Complementary operator Q, 28, 31,
44, 107, 240 

Complete learning, 3, 15, 120 




Conditioned stimulus, 237, 238, 253- 
257 
Conditioning, 274, 275, 288 
classical (Pavlovian), 237, 253, 254,
330 
instrumental, 237, 253, 254 
Consistent estimator, 200 
Contiguity theory, 167, 189-191 
Convergence of series, 100, 101 
Convex union of sets, 99 
Coombs, C. H., x, 45, 105 
Criteria for extinction, 192 
CS-US interval, 238, 256-257 
Cumulative distribution, 87, 131-138, 
320 
Cumulative number of responses, 250, 
251 
Curve fitting, 333-335 


Davis, H. T., 18, 28, 45 
Davis, R. L., x, 45, 105, 278 
Delay in reinforcement, 191, 332 
Detambel, M. H., 278, 279, 300, 307, 
309 
Difference equations, 61, 72, 75, 89, 
110, 114, 126, 139, 180, 181, 183 
Differential equation, 61, 139, 181, 183 
Discrimination, 48, 296, 332 
Distributions, asymptotic, 99, 131-138, 
155, 156, 164-169, 174, 331, 332 
binomial, 155, 156, 193, 272 
chi-square, 214 
cumulative, 87, 131-138, 320
gamma, 317-320 
initial, 85 
marginal, 92 
multivariate, 92 
negative binomial, 223, 313, 317 
response probability, 6, 70, 71, 79, 
83 ff. 
Divergence of product, 168 
Dogs, 238 ff. 
Dollard, J., 259, 273 
Drive, fear as a, 237 
Drive strength, 191, 332 


Eisenhart, C., 214, 216 

Entwisle, D., xi 

Equal alpha condition, 106 ff., 135, 137, 
279-288, 292, 295, 299, 309




Ergodic theorem, 99, 133 
Errors, total number of, 176, 197, 225, 
226, 245, 246, 257, 290 
Escape, 237 
Estes, W. K., ix, x, 46, 48, 54, 82, 190, 
254, 312, 313, 328, 331, 337
Estimates, best asymptotically normal, 
267 
consistent, 200 
maximum likelihood, 200-212, 224, 
226-230, 233, 241, 243-245, 267, 
319, 346, 347 
minimum chi-square, 231-234, 266- 
269 
minimum variance, 200, 246 
observed chi-square, 267-269
simultaneous, 202, 207, 210, 228, 
229 
unbiased, 199, 200, 205, 225, 231, 
246 
variance of, 199, 200, 202, 203, 210- 
212, 222, 224, 228, 245, 248, 249, 
257 
Estimation, general problem, 192-194, 
332, 339, 344 
Estimation, of α's, 8-10, 195-198, 201,
203-212, 225-234, 241-249, 257,
264, 265, 281-284, 288-299, 303,
304, 325
of 4, 230-234 
of k, Sj, and cj, 319-324 
of py and V, p, 210, 222, 228, 229, 
266-269, 281, 282, 284, 290, 291, 
293 
of probabilities, 5, 16, 192, 193 
Event invariance, 308 
Event operators, see Operators
Events, basic concept of, 1, 17, 44, 188 
experimenter-controlled, 68-73, 81, 
82, 88-93, 108-110, 124-127, 
260-273, 278-282, 306, 309 
experimenter-subject-controlled, 80, 
81, 97, 98, 118-124, 183, 262, 
286-291, 309 
general notion of, 1, 17, 44, 188 
identifications with, 1, 2, 188, 189,
219, 240, 260, 279, 286, 301, 308, 
325, 330, 331 
sequences of, 55 ff.


Events, subject-controlled, 76-80, 93-97, 110-118, 153 ff., 171 ff., 218 ff.,
238 ff.
Expectancy, 308
Expected operator, 73, 111, 122, 138-141,
161, 165, 175, 180, 288, 289
Expected value, 100-104, 176-179, 182,
203, 205, 211, 214, 223, 224,
280, 281, 313, 318
Experimental problems, alternations, 199
amount of reward, 191, 332
avoidance training, 6-10, 195-197,
237 ff.
delay in reinforcement, 191, 332
discrimination, 48, 296, 332
drive strength, 191, 332
escape, 237
extinction, 116, 153, 169, 173, 175,
191, 192, 295, 297, 303-309
fixed-ratio reinforcement, 55
generalization, 48, 254, 332
imitation, 259 ff.
intertrial interval, 276
partial reinforcement, 55, 56, 66, 68,
274 ff., 310
position preferences, 4, 6
punishment, 190
random-ratio reinforcement, 56, 68
rate of responding, 330
response intensity, 330
reward, amount of, 191, 332
rote learning, 217 ff.
runway, 55, 188, 310 ff.
secondary reinforcement, 287, 332
shock intensity, 191, 238
Skinner box, see Bar-pressing
spontaneous recovery, 48, 332
three-choice situations, 300-303
time between trials, 276
T-maze, 4, 5, 68, 69, 76, 106, 115,
116, 154, 173, 188, 192, 195, 197,
274-276, 286, 291 ff., 330
two-armed bandit, 277, 294-300,
303-307
two-choice situations, 274 ff.
verbal learning, 217 ff.
work, amount of, 191, 332
Experimenter-controlled events, see
Events, experimenter-controlled
Experimenter-subject-controlled events,
see Events, experimenter-subject-controlled
Extinction, 116, 153, 169, 173, 175, 191,
192, 295, 297, 303-309


Fear, 237 
Feller, W., 45, 77, 82, 105, 184


isher, R. A.. $2 " 
Fitch, F. BA 34^ 152, 200. 328 


Fixed point, 30, 60 
Fixed-point form, 29, 36, 44 
Fixed-ratio reinforcement, 55 
Flood, M. M., x, 278 
Frick, F. C., 56, 82 
Functional equation, 163, 164 


Gagné, R. M., 55, 82, 310, 313, 328 
Gain-loss form, 29, 36, 44, 52 
Gamma distribution, 317-320 
General model, 1, 2 
Generalization, stimulus, 48, 254, 332 
Geometric interpretation of parameters, 
30 
Gerbrands, H., 277 
Gestalten, 47 
Girshick, M. A., 225, 236 
Goal-directed movements, 312 ff. 
Goodness-of-fit, chi-square, 212, 213, 
233, 234, 271 
general problems, 5, 215, 216, 285, 
327, 332, 333 
from Monte Carlo runs, 230, 251, 
252 
run test, 214, 215, 251, 271 
U-statistic, 214, 253 
Goodnow, J. J., 277, 294, 295, 296, 
298, 304, 305, 307 
Gossett, W. S., 129 
Graham, C., 55, 82, 310, 313, 328 
Grant, D. A., 276, 278, 282, 309 
Gulliksen, H., viii 
Guthrian theory, 167, 189-191 
Guthrie, E. R., 76, 170, 187, 190, 216, 
333 




Habit strength, 29 

Hake, H. W., 76, 82, 124, 126, 127, 
276, 278, 282, 283, 309 

Hall, M., 236 

Harris, K. S., x 

Harris, T. E., x, 99, 105, 170 

Harvard Computation Laboratory, 339 

Hays, D. G., x, 346, 347 

Hilgard, E. R., 258 

Hoel, P. G., 216 

Homogeneity assumption, 50-53 

Hornseth, J. P., 276, 278, 282, 309 

Hovland, C. I., 236 

Hull, C. L., viii, 29, 45, 190, 216, 236, 
332, 333, 337 

Hullian theory, 189-191

Humphreys, L. G., 276, 278, 282, 309 

Hyman, R., 76, 82, 124, 126, 127, 276, 
278, 283, 309 


Identifications, 1, 2, 187-189, 219, 240, 
260, 279, 286, 301, 308, 311, 312,
325, 330, 331 

Identity matrix, 24 

Identity operator, 18, 172, 180-183, 
219, 282, 300-303, 309 

Imitation experiment, 259 ff. 

Imperfect learning, 230-234, 314 

Independence-of-path assumption, 17, 
20, 78, 330 

Individual differences, 253, 256 

Induction, mathematical, 59 

Inhibition, 173 

Initial distribution of probabilities, 85 

Instrumental conditioning, 237, 253, 
254 

Integers, sum of, 103 

Intensity of response, 330 

Intersection of sets, 49 

Intertrial interval, 276 

Intuition, use of, 189 

Invariance, event, 308 

Invariance rule, probability, 15, 27, 38, 
44 

Iterates, 62 


Jarvik, M. E., 276, 278, 283, 284, 309 
Jenkins, W. O., x, 82, 305, 307, 309 
Join of sets, 49 

Jordan, C., 61, 82 




Karlin, S., x, 98, 105, 331, 337 
Keller, F. S., 258 

Kendall, M. G., 152, 216 
Koopmans, T. C., x 


Latency, 275, 310 ff., 330 
Law of effect, 189 
Law of large numbers, 180 
Learning, cluster, 234 
complete, 3, 15, 120 
general notion of, 3 
imperfect, 230-234, 314
one-trial, 32 
perfect, 153, 156, 167, 169, 176, 197, 
219 
rote, 217 ff. 
serial, 217 
theories of, 167, 187, 189-191, 237, 
253, 254, 333 
verbal, 217 ff. 
Limit point, 60, 173-179, 332 
Limit vector, 64, 93, 110, 301, 302 
Linear operators, 19-21, 28, 44, 331 
Linearity assumption, 20, 44, 331 


McFann, H., 82 
McGill, W. J., x, 158, 181, 184, 217,
218, 219, 236
MacLane, S., 59, 82
Marginal means, 92, 110, 121, 123
Markov chains, 56, 58, 73-78, 82, 112,
124-127, 276, 283
Marquis, D. G., 258
Mathematical induction, 59
Matrices, see also Operators
addition of, 22, 24
associativity of, 24, 25
commutativity of, 23, 24, 63, 65, 66
definition of, 21, 22
identity, 24
iterates of, 62
multiplication of, 23, 24
powers of, 62
scalar multiplier, 23, 24
stochastic, 28, 38, 44
Maximum likelihood, 200-212, 224,
226-230, 233, 241, 243-245, 267,
319, 346, 347


Means, asymptotic, 89, 93, 109, 111, 
119, 120, 123, 124 
bounds on, 128, 141-152, 156-162, 
165, 174, 175, 180 
definition, 85 
explicit formulas for, 89, 91, 93, 109, 
112, 119, 120, 175, 251, 280, 282, 
288, 289, 303 
marginal, 92, 110, 121, 123 
recurrence formulas for, 93, 96, 110, 
118-120, 174, 180, 181, 183
Measure of sets, 47, 48 
Measures of behavior, 329, 330 
Median, 324 
Meet of sets, 49 
Memory, 17, 330 
Miller, G. A., x, xi, 56, 73, 82, 181, 184,
217, 218, 219, 220, 230, 235, 236 
Miller, N. E., 237, 258, 259, 273 
Miller-Mowrer shuttle box, 238 
Minimum chi square, 231-234, 266- 
269 
Minimum-variance estimates, 200, 246 
Moment generating function, 317 
Moments, bounds for, 145-148 
definition of, 85-87 
recurrence formulas for, 88, 91, 94,
98
Monte Carlo method, 129, 166, 194,
195, 197, 198, 200, 230, 235, 251,
252
Mood, A. M., 216, 236, 328 
Mosteller, F., xi, 45, 54, 82, 105, 152,
216, 225, 236, 278, 312, 328, 337
Multivariate distribution, 92 


Negative binomial, 223, 313, 317 

Neimark, E. D., 93, 105, 110, 278, 279,
281, 284, 285, 299, 300, 301, 302,
303, 306, 307, 309

Neurons, 315, 316

Newman, E. B., 76, 82 

Neyman, J., 267, 273 

Non-linear operators, 20 

Nonsense syllables, 217 


Observed chi square, 267-269 
One-trial learning, 32 
Operands, 18 




Operators, addition, 18, 20, 21 

commuting, 19, 63-66, 171 ff., 219, 
240, 253 

complementary, 28, 31, 44, 107, 240 

concept of, 1, 18-21 

deduction of, from set model, 51-54

expected, 73, 111, 122, 138-141, 
161, 165, 175, 180, 288, 289 

fixed-point form of, 29, 36, 44 

gain-loss form of, 29, 36, 44, 52 

identity, 18, 172, 180-183, 219, 282,
300-303, 309 

iterates of, 62 

limit points of, 60, 173-179, 332 

linear, 19-21, 28, 44, 331 

matrix, 21-28 

multiplication, 19 

non-linear, 20 

null, 18 

repetitive application of, 58-64 

row, 28, 29, 44 

slope-intercept form of, 29, 36, 44 

Outcomes, 17, 69, 80, 97, 118, 119,
121-124, 183, 261, 286-288


p value, definition, 83 

p values, distributions of, 83 ff. 

Partial reinforcement, 55, 56, 66, 68, 
274 ff., 310 

Path independence, 17, 20, 78, 330

Pavlovian conditioning, 237, 253, 254,
330 

Pearson, K., 212, 213, 220, 267, 271, 
328 

Peirce, B. O., 152 

Percentage points, 320-324 

Percentiles, 320-324 

Perfect learning, 153, 156, 167, 169, 
176, 197, 219

Perkins, D. T., 236 

Playing-card experiment, 278, 298-300 

Position preferences, 4, 6 

Postman, L., 216 

Powers of matrix, 62 

Practice effect, 219, 330 

Pre-absorption, 168, 169 

Prediction experiment, 276, 282-284 

Probabilities, distributions of, 6, 70, 71, 
79, 83 ff. 




Probabilities, estimates of, 5, 16, 192, 
193 
factors which change, 17 
heuristic devices for, 15, 16 
invariance rule for, 15, 27, 38, 44 
vector of, 26, 27 
Punishment, 190 


Radner, R., x 
Random numbers, 129-131 
Random ratio reinforcement, 56, 68 
Random variables, 100, 104, 262, 263, 
272, 280, 315, 317 
Rashevsky, N., viii 
Rate of responding, 330 
Recency effect, 171 
Recursive formulas, 60, 88, 91-98, 110, 
118-120, 174, 180, 181, 183 
Reflecting barriers, 154, 169 
Reinforcement, fixed ratio, 55 
partial, 55, 56, 66, 68, 274 ff., 310
random ratio, 56, 68 
secondary, 287, 332 
Reinforcement theory, 189-191 
Response classes, 13-17 
Response intensity, 330 
Restrictions on parameters, 33-36, 43, 
44 
Reward, amount of, 191, 332 
Robillard, L., 277, 296, 297, 298, 304, 
305, 306, 307 
Ross, R. T., 236 
Rote learning, 217 ff. 
Row operator, 28, 29, 44 
Row vector, 25 
Run test, 214, 215, 251, 271 
Running time, 310 
Runs, length of, 100-104, 332, 339 
mean length of, 100-104, 177, 214 
variance of length of, 102-104, 215 
Runway, 55, 188, 310 ff. 


Samoan language, 76 

Savage, L. J., 43, 44, 200, 216, 225, 
236, 331 

Scalar multiplication of matrices, 23, 
24 

Schein, E., 259, 273, 276 

Schoenfeld, W. N., 258 

Schonert, V. L., xi 




Second moments, asymptotic, 90, 114 
bounds on, 145 
explicit formulas for, 90, 92, 114 
recurrence formulas for, 96, 114 
Secondary reinforcement, 287, 332 
Sequences of events, 55 ff. 
Sequential dependence, 56-58 
Serial learning, 217 
Set, psychological, 308 
Sets, complement of, 49 
convex union of, 99 
disjunct, 48 
empty, 49 
intersection of, 49 
join of, 49 
measure of, 47, 48 
meet of, 49 
product of, 49 
subsets of, 48 
sum of, 48, 49 
union of, 48, 49, 99 
Shapiro, H. N., x, 163, 170 
Shock intensity, 191, 238 
Shwartz, N., 259, 260, 261, 262,
264, 265, 269, 270, 271, 273 
Similarity index, 254 
Simultaneous estimates of parameters, 
202, 207, 210, 228, 229 
Skinner, B. F., 333 
Skinner box, see Bar-pressing 
Slope-intercept form, 29, 36, 44 
Smith, M. B., 152 
Snedecor, G. W., 152 
Solomon, R. L., 6, 7, 9, 10, 177, 197,
238, 239, 240, 250, 251, 252, 253,
254, 256, 257, 258 
Spence, K. W., 190, 216 
Spontaneous recovery, 48, 332 
Stanley, J. C., Jr., 82, 275, 276, 277,
279, 291, 292, 293, 294, 305, 307,
309
Stat-dogs, 251, 252
State of pre-absorption, 168, 169 
States of Markov chains, 77 
Statistics, of data, 7-10, 195-199 
relation of model to, 2, 3 
Stat-rats, 129-138, 194, 195, 197, 198,
200, 291 
Stevens, S. S., 216, 236, 258 




Stimulus sampling and conditioning, 
46 ff. 

Stochastic matrix, 28, 38, 44 

Stochastic processes, 3, 20 

Stouffer, S. A., v 

Student, 129, 152 

Subject-controlled events, see Events, 
subject-controlled 

Subsets, 48 

Sum of integers, 103 

Swed, F. S., 214, 216 

Symmetry in experiments, 106, 274 ff. 


Theorem, ergodic, 99, 133 
trapping, 98, 99 
Theory, association, 167, 189-191 
cognitive, 333 
contiguity, 167, 189-191
Guthrian, 167, 189-191 
Hullian, 189-191 
reinforcement, 189-191 
two-factor, 237, 253, 254 
Third moment, bounds on, 148 
Thompson, G. L., x, 40, 45, 105, 278, 
331, 337 
Thorndike, E. L., 189 
Thrall, R. M., x, 45, 105 
Three-choice situations, 300-303 
Thurstone, L. L., viii, 21, 45 
Time between trials, 276 
Time series, 58 
Tippett, L. H. C., 152 
T-maze, 4, 5, 68, 69, 76, 106, 115, 116, 
154, 173, 188, 192, 195, 197, 274- 
276, 286, 291 ff., 330 
Trapping theorems, 98, 99 
Tree diagrams, 70, 79 
Trials, before first occurrence of a response,
177, 178, 182, 195, 196, 242, 257
before second occurrence of a response,
179, 182, 196
definition of, 14, 44, 188, 311, 312, 
329, 330 
Tukey, J. W., x 




Two-armed bandit, 277, 294-300, 303-307

Two-choice situations, 274 ff. 

Two-factor theory, 237, 253, 254 


Unbiased estimates, 199, 200, 205, 225, 
231, 246 

Unconditioned stimulus, 237, 253-257 

Union of sets, 48, 49, 99 

Uspensky, J. V., 45 

U-statistic, 214, 253 


Variances, of binomial, 272 
of estimates, 199, 200, 202, 203, 
210-212, 222, 224, 228, 245, 248, 
249, 257 
of gamma, 318 
of latency distribution, 314, 318 
of negative binomial, 314 
of p-value distributions, 109, 182 
of run lengths, 102-104, 215 
Vector, column, 25 
limit, 64, 93, 110, 301, 302 
marginal mean, 92, 110, 121, 123 
probability, 26, 27 
row, 25 
Verbal learning, 217 ff. 


Weinstock, S., x, 310, 312, 313, 319, 
321, 323, 325, 327, 328 

Weizenbaum, J., x 

Wilks, S. S., 216 

Wilson, T. R., xi, 346 

Withdrawal responses, 190, 237 

Wolfle, D. L., viii 

Work, amount of, 191, 332 

Wynne, L. C., 6, 7, 9, 10, 177, 197, 238, 
239, 240, 250, 251, 252, 253, 254, 
256, 257, 258 


Yates, F., 152, 212
Youtz, C., xi, 344 


Zeaman, D., x 
Zimmerman, C., 217, 219, 220, 230, 
235, 236 

